Correlation and Causation

Correlation means that there is a statistical relationship between two variables: as one changes, the other tends to change too. Causation means that a change in one variable directly produces a change in the other.

Causation

  • Correlation does not imply causation.
  • If two variables are correlated, that does not mean that changes in one directly cause changes in the other.
Example

  • Ice cream sales might be positively correlated with temperature, but this does not mean that ice cream sales cause higher temperatures.

Pearson's Linear Correlation

Two variables are linearly correlated if, on a scatter plot, you can draw a straight line that lies close to most of the points.

Variables and constants

  • The variable x is the independent or explanatory variable.
  • The variable y is the dependent or response variable.
  • In the regression line y = a + bx, the constant b is the slope.
    • The sign and size of b tell us how much y changes, on average, for every unit increase in x.
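
To see what the slope b means in practice, here is a minimal sketch. It assumes the regression line is the usual least-squares line y = a + bx (the notes do not say how the line is fitted), and the function name `least_squares_line` and the data are invented for illustration only.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (assumed fitting method)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of products of deviations, and sum of squared deviations of x
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope: average change in y per unit increase in x
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Invented illustrative data, not taken from the notes
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

For this made-up data the slope comes out at about 1.99, so y increases by roughly 2 on average for every unit increase in x. A negative b would mean y decreases on average as x increases.
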
Pearson's correlation coefficient

  • Pearson's correlation coefficient, r, measures the strength and direction of the linear correlation between x and y.
  • The value of r is always between –1 and +1 inclusive.
  • Values of r close to –1 or +1 indicate a strong linear relationship between x and y.
    • If r = 0, there is no linear correlation, although x and y may still be related in a non-linear way.
    • If r = 1, there is a perfect positive correlation, so all points lie on a straight line with positive slope.
    • If r = –1, there is a perfect negative correlation, so all points lie on a straight line with negative slope.
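
The three special cases above can be checked numerically. This is a small sketch, not a standard routine: the helper `pearson_r` is a hypothetical name, it computes r through the equivalent sums-of-deviations form, and the data sets are invented so that the points lie exactly on a rising line, a falling line, and a symmetric curve.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's r via sums of deviations (hypothetical helper for illustration)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

xs = [-2, -1, 0, 1, 2]

r_pos = pearson_r(xs, [2 * x + 1 for x in xs])    # points on a rising straight line
r_neg = pearson_r(xs, [10 - 3 * x for x in xs])   # points on a falling straight line
r_zero = pearson_r(xs, [x * x for x in xs])       # points on a symmetric curve y = x^2

print(round(r_pos, 6), round(r_neg, 6), abs(round(r_zero, 6)))
```

Up to floating-point rounding, the three values are 1, –1 and 0. The last case is worth noting: y is completely determined by x, yet r is 0 because the relationship is not linear.
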
Formula

  • The formula for the correlation coefficient is:
    • $r = \frac{1}{n-1}\sum\left(\frac{x_i-\overline{x}}{s_x}\right)\left(\frac{y_i-\overline{y}}{s_y}\right)$
  • Where $\overline{x}$ and $\overline{y}$ are the means and $s_x$ and $s_y$ are the standard deviations of the explanatory and response variables respectively.
  • It is much more common to calculate r using a calculator.
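
As a rough illustration of what a calculator does with this formula, the sketch below applies it term by term. It assumes s_x and s_y are sample standard deviations (dividing by n – 1, which is what Python's `statistics.stdev` uses), and the function name and data are made up for the example.

```python
from statistics import mean, stdev

def correlation_from_formula(xs, ys):
    """Apply r = 1/(n-1) * sum of (standardised x) * (standardised y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations (assumed)
    total = 0.0
    for x, y in zip(xs, ys):
        z_x = (x - x_bar) / s_x       # how many standard deviations x is from its mean
        z_y = (y - y_bar) / s_y       # how many standard deviations y is from its mean
        total += z_x * z_y
    return total / (n - 1)

# Invented illustrative data, not taken from the notes
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation_from_formula(xs, ys)
print(f"r = {r:.3f}")  # prints r = 0.775
```

Here r is about 0.775, which indicates a fairly strong positive linear correlation for this made-up data set.
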