Imagine that you studied for 2 hours for a math test and received a score of 99. The next time you took a math test, you studied for only 30 minutes and received a score of 83.
You wonder if there is a connection or relationship between the amount of time spent studying and the grade received on the math test.
By asking your classmates, you are able to find out how long each student in your class studied and the grade each one received on the recent math test. The results are shown in the table below.
We can turn this data into a scatterplot to better see the relationship, or correlation, between time spent studying and test score.
A scatterplot is a graph with a horizontal axis and a vertical axis, on which each pair of values is plotted as a single point.
The horizontal axis shows the independent variable, or the thing that we change or control. In this case, the independent variable is the amount of time spent studying, since that is something we can control and change.
The vertical axis shows the dependent variable, which depends on the independent variable. The dependent variable is also what we measure. In this case, the dependent variable is the test score: it is the outcome we measure, not something we can set directly.
We can create our graph with a horizontal and vertical axis as shown below.
Now we can plot each of our data points from our table onto the scatterplot.
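As a rough sketch, the plotting step could be done in Python with the matplotlib library. The table's values are not reproduced in this text, so the `study_data` list below is made-up illustration data, not the class's actual results, and `draw_scatterplot` is a hypothetical helper name.

```python
# Hypothetical (minutes studied, test score) pairs, invented for illustration.
study_data = [
    (0, 55), (0, 62), (0, 70),
    (30, 83), (45, 78), (60, 85),
    (90, 92), (120, 99),
]

# Independent variable (horizontal axis) and dependent variable (vertical axis).
minutes = [m for m, _ in study_data]
scores = [s for _, s in study_data]


def draw_scatterplot(x_values, y_values):
    """Plot one dot per student and display the figure."""
    # Imported inside the function so the data preparation above still runs
    # where matplotlib is not installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x_values, y_values)
    ax.set_xlabel("Minutes spent studying")
    ax.set_ylabel("Score on math test")
    ax.set_title("Study time vs. test score")
    plt.show()
```

Calling `draw_scatterplot(minutes, scores)` opens a window with one dot per student, with study time on the horizontal axis and test score on the vertical axis.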
What can we observe from this scatterplot?
Looking at the points on the scatterplot, we can see that, in general, the more time a student spent studying, the higher their test grade. However, students who study for the same amount of time may still receive different test scores. For example, the 3 students who each studied for 0 minutes all received different scores.
Seeing Relationships in the Data
We can draw a line of best fit through the data points in order to better see if there is a relationship between our variables: minutes spent studying and grade on math test.
A line of best fit is a line that follows the general direction of the data points - it "best fits" in between them. Ideally there should be about the same number of data points on each side of the line.
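One way to check that balance is to count how many points fall on each side of a candidate line. The sketch below assumes made-up data and a hand-picked line, score = 0.4 * minutes + 60. Neither comes from the class table, and `count_sides` is a hypothetical helper name.

```python
# Hypothetical (minutes studied, test score) pairs, invented for illustration.
points = [(0, 62), (20, 64), (40, 79), (60, 81), (80, 95), (100, 97)]

# A hand-picked candidate line: score = SLOPE * minutes + INTERCEPT (an assumption).
SLOPE = 0.4
INTERCEPT = 60


def count_sides(points, slope, intercept):
    """Count the points above, below, and exactly on the line."""
    above = below = on_line = 0
    for x, y in points:
        line_y = slope * x + intercept  # height of the line at this x
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on_line += 1
    return above, below, on_line


above, below, on_line = count_sides(points, SLOPE, INTERCEPT)
print(f"above: {above}, below: {below}, on the line: {on_line}")
# prints: above: 3, below: 3, on the line: 0
```

An even split like this suggests the candidate line runs through the middle of the points. A lopsided count suggests the line should be shifted or tilted.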
We can use a line of best fit to make predictions. For example, we can see that the point (100, 100) lies on the line of best fit. This predicts that if you studied for 100 minutes, you would receive about a 100 on your test.
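The line of best fit described here is drawn by eye. A common way to compute one instead is least-squares regression, sketched below as a stand-in technique. The data are made up and chosen so that the fitted line passes through (100, 100), like the example above. `fit_line` and `predict` are hypothetical names.

```python
# Hypothetical (minutes studied, test score) pairs, invented for illustration.
minutes = [0, 20, 40, 60, 80]
scores = [62, 64, 76, 88, 90]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Read the predicted score off the line for a given study time."""
    return slope * x + intercept


slope, intercept = fit_line(minutes, scores)
print(f"score = {slope:.2f} * minutes + {intercept:.2f}")
print(f"predicted score for 100 minutes: {predict(slope, intercept, 100):.0f}")
```

For this made-up data the fitted line is score = 0.40 * minutes + 60.00, so the predicted score for 100 minutes is 100. A prediction like this is an estimate: students who study for the same amount of time can still receive different scores.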
Scatterplots are used to help us see relationships, or correlation, between variables.
In our example, we were trying to figure out if there is a relationship between minutes spent studying and score on a math test.
The line of best fit can help us determine if there is a relationship.
There are 3 specific types of relationships or correlations that we look for.
1) Positive correlation
Positive correlation means that the two variables are directly related to one another.
If the independent variable increases, the dependent variable tends to increase as well.
Similarly, if the independent variable decreases, the dependent variable tends to decrease.
For example, the more food you feed your puppy, the more it will weigh.
This scatterplot has a positive correlation because the line of best fit slopes upward from left to right.
2) Negative correlation
Negative correlation means that the two variables are inversely related to one another.
This means that if the independent variable increases, the dependent variable tends to decrease.
Similarly, if the independent variable decreases, the dependent variable tends to increase.
For example, the more times you are absent, the lower your class grade is likely to be.
This scatterplot has a negative correlation because the line of best fit slopes downward from left to right.
3) No correlation
Sometimes it is not easy to see any relationship between the variables on a scatterplot, and it might feel impossible to try to draw a line of best fit through the points.
In this case, we can say there is no correlation, or no relationship between the variables.
For example, the month you were born in has no correlation with the number of siblings you have.
Try to draw a line of best fit through this scatterplot. It is very difficult, because the dots do not seem to be moving in any one particular direction.
That means that this scatterplot has no correlation.
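The three cases can also be told apart with a number instead of by eye. The sketch below uses the Pearson correlation coefficient, a standard statistic that this lesson does not otherwise cover, as a stand-in for judging the line of best fit visually. All data are made up to echo the three examples above, the 0.3 cutoff is an arbitrary assumption, and `pearson_r` and `classify` are hypothetical names.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a value between -1 and +1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / math.sqrt(sxx * syy)


def classify(xs, ys, threshold=0.3):
    """Label data as 'positive', 'negative', or 'none'.

    The 0.3 cutoff is an arbitrary assumption for this sketch, not a standard.
    """
    r = pearson_r(xs, ys)
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Made-up data echoing the three examples in the text.
cups_of_food = [1, 2, 3, 4, 5]
puppy_weight = [2, 4, 5, 4, 6]

absences = [0, 1, 2, 3, 4]
class_grade = [95, 90, 88, 80, 72]

birth_month = [1, 3, 5, 7, 9, 11]
siblings = [2, 0, 1, 1, 0, 2]

print(classify(cups_of_food, puppy_weight))  # positive
print(classify(absences, class_grade))       # negative
print(classify(birth_month, siblings))       # none
```

A coefficient near +1 corresponds to points that cluster around an upward-sloping line, a coefficient near -1 to a downward-sloping line, and a coefficient near 0 to points with no clear direction.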