Thursday, January 8, 2009

Why numerical measures?

So far we have used scatterplots to study the relationship between two quantitative variables. We studied the overall pattern of the relationship by looking at its direction (positive, negative, or neither), form (linear, curvilinear, etc.), and strength (e.g., stronger or weaker).

We also noted that assessing the strength of a relationship just by looking at the scatterplot is quite hard. We therefore need to supplement the scatterplot with a numerical measure that will help us assess the strength. That numerical measure is what we are going to learn next.

Let's have a look at the following two images (from the course website) to see why a numerical measure is needed alongside the scatterplot to assess the strength of a linear relationship.
We can see that in both cases, the direction of the relationship is positive and the form of the relationship is linear. What about the strength? Recall that the strength of a relationship is the extent to which the data follow its form.

At first glance, the relationship in the first graph looks stronger. But as the course website clarifies, both graphs show the same data, just drawn using two different scales!

The purpose of this example was to illustrate how assessing the strength of the linear relationship from a scatterplot alone is problematic, since our judgment might be affected by the range of values that are plotted. This example, therefore, provides a motivation for the need to supplement the scatterplot with a numerical summary that will measure the strength of the linear relationship between two quantitative variables.
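To make the point concrete, here is a minimal sketch in Python. It assumes that the numerical summary is the Pearson correlation coefficient, which this post has not named or defined, and it uses made-up data rather than the data from the course website. The helper `pearson_r` is hypothetical and written only for illustration. The idea is that such a number is computed from the data alone, so the axis range chosen for a plot cannot affect it, and even changing the units of the variables leaves it unchanged.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed here as the numerical measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data with a positive, roughly linear relationship
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9]

# The measure takes only the data as input; plot axis limits play no role.
r_original = pearson_r(xs, ys)

# Shifting x and changing the units of y (a positive linear rescaling)
# changes how a plot looks, but not the value of the measure.
xs_shifted = [x + 50 for x in xs]
ys_rescaled = [100 * y for y in ys]
r_rescaled = pearson_r(xs_shifted, ys_rescaled)

print(round(r_original, 4), round(r_rescaled, 4))
```

Both printed values agree, which is the property the scatterplot alone lacks: our visual impression depends on the scale of the plot, while the number does not.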