Pearson Correlation Coefficient Explained

The Pearson correlation coefficient (r) measures the strength and direction of the linear relationship between two variables. It ranges from −1 to +1, and it’s one of the most commonly reported statistics in research.

What Does r Tell You?

r value	Meaning (rule of thumb)
r = +1.0	Perfect positive linear relationship
+0.7 ≤ r < +1.0	Strong positive
+0.4 ≤ r < +0.7	Moderate positive
+0.1 ≤ r < +0.4	Weak positive
−0.1 < r < +0.1	Little or no linear relationship
−0.4 < r ≤ −0.1	Weak negative
−0.7 < r ≤ −0.4	Moderate negative
−1.0 < r ≤ −0.7	Strong negative
r = −1.0	Perfect negative linear relationship

Key insight: r = 0 doesn’t mean “no relationship” — it means no linear relationship. Two variables can have a perfect curvilinear relationship and still show r ≈ 0.
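
To see this concretely, here is a minimal Python sketch. The helper name `pearson_r` and the seven sample points are illustrative assumptions, not data from this article. Because y = x² is symmetric around zero, the contributions from the negative and positive halves cancel, so r is 0 even though y is completely determined by x.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up points on a perfect parabola, symmetric around x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curve = pearson_r(x, y)
print(r_curve)  # no linear trend, despite a perfect curvilinear relationship
```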

The Formula

$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$

That looks intimidating, but our Correlation Calculator handles it instantly. Let’s walk through a manual example to build intuition.
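
As an aside before the example, the same coefficient is often written in terms of deviations from the means, which may make it clearer what r measures. Here $\bar{x}$ and $\bar{y}$ denote the sample means, a notation not used elsewhere in this article:

$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$

The numerator is positive when x and y tend to sit on the same side of their means together, and negative when they sit on opposite sides. Multiplying the numerator and denominator by n and expanding the sums recovers the computational formula above.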

Worked Example: Study Hours vs. Exam Score

Student	Hours studied (x)	Exam score (y)
1	2	65
2	4	72
3	5	80
4	6	78
5	8	90
6	10	95

Step 1: Calculate the sums

$n = 6$
$\sum x = 2+4+5+6+8+10 = 35$
$\sum y = 65+72+80+78+90+95 = 480$
$\sum xy = 130+288+400+468+720+950 = 2956$
$\sum x^2 = 4+16+25+36+64+100 = 245$
$\sum y^2 = 4225+5184+6400+6084+8100+9025 = 39018$
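
If you would rather let a script do the bookkeeping, the sums above can be reproduced with a few lines of Python. This is only a sketch; the variable names are illustrative choices.

```python
# Data from the worked example table.
hours = [2, 4, 5, 6, 8, 10]          # x
scores = [65, 72, 80, 78, 90, 95]    # y

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in scores)

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
# 6 35 480 2956 245 39018
```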

Step 2: Plug into the formula

$r = \frac{6(2956) - (35)(480)}{\sqrt{[6(245) - 35^2][6(39018) - 480^2]}}$

$r = \frac{17736 - 16800}{\sqrt{[1470 - 1225][234108 - 230400]}}$

$r = \frac{936}{\sqrt{245 \times 3708}} = \frac{936}{\sqrt{908460}} = \frac{936}{953.13} = 0.982$
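
The same substitution can be checked in Python. The function name `pearson_from_sums` is a hypothetical helper introduced for this sketch; it simply evaluates the formula from the six quantities computed in Step 1.

```python
import math

def pearson_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Evaluate the computational formula for r from precomputed sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Sums taken from Step 1 of the worked example.
r = pearson_from_sums(6, 35, 480, 2956, 245, 39018)
print(round(r, 3))  # 0.982
```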

Step 3: Interpret

r = 0.982 indicates a very strong positive correlation: as study hours increase, exam scores rise along an almost perfectly straight line.

R² — The Coefficient of Determination

Square the correlation to get $R^2$:

$R^2 = 0.982^2 = 0.964$

This means 96.4% of the variation in exam scores can be explained by a linear relationship with study hours. That’s very high explanatory power.
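
One way to see where "explained variation" comes from is to fit a least-squares line and compare the leftover variation to the total. The sketch below assumes a simple linear regression of score on hours, which this article does not otherwise cover; the variable names are illustrative.

```python
hours = [2, 4, 5, 6, 8, 10]
scores = [65, 72, 80, 78, 90, 95]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sums of squared deviations and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)  # total variation in y

# Least-squares line: score = intercept + slope * hours.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation left over after the line has done its work.
ss_residual = sum(
    (y - (intercept + slope * x)) ** 2 for x, y in zip(hours, scores)
)

r_squared_from_fit = 1 - ss_residual / syy
r_squared_from_r = sxy ** 2 / (sxx * syy)  # r squared, computed directly

print(round(r_squared_from_fit, 3), round(r_squared_from_r, 3))
```

Both routes give the same number, which is why squaring r can be read as the share of variation accounted for by the fitted line.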

R²	Interpretation
> 0.75	Strong explanatory power
0.50 – 0.75	Moderate
0.25 – 0.50	Weak
< 0.25	Very weak

Testing if r Is Significant

A non-zero sample r doesn’t by itself mean the true population correlation (ρ) is non-zero. To test the null hypothesis ρ = 0, convert r to a t-statistic:

$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$

For our example:

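$t = \frac{0.982\sqrt{4}}{\sqrt{1 - 0.964}} = \frac{0.982 \times 2}{0.190} \approx 10.34$

(Carrying unrounded values of r through the calculation gives t ≈ 10.41; the rounded intermediate figures account for the small difference.)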

With df = n − 2 = 4, check the t-table: the critical value at α = 0.05 (two-tailed) is 2.776. Since 10.34 > 2.776, the correlation is statistically significant.
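
The test can be sketched in Python as follows. The function names are hypothetical helpers, and the critical value 2.776 is hard-coded from the t-table figure quoted above rather than computed. Because the script does not round r along the way, its t comes out near 10.4, slightly above the hand calculation.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

hours = [2, 4, 5, 6, 8, 10]
scores = [65, 72, 80, 78, 90, 95]

r = pearson_r(hours, scores)
t = t_statistic(r, len(hours))

T_CRITICAL = 2.776  # two-tailed, alpha = 0.05, df = 4 (from the t-table)
significant = abs(t) > T_CRITICAL

print(round(t, 2), significant)
```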

Assumptions of Pearson’s r

Both variables are continuous — for ranked data, use Spearman’s ρ instead
Linear relationship — always plot your data first!
No extreme outliers — a single outlier can dramatically shift r
Both variables approximately normally distributed (matters most for significance testing)
Homoscedasticity — spread around the regression line should be roughly constant
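
The outlier assumption is easy to underestimate, so here is a small sketch. It takes the six students from the worked example and adds one hypothetical seventh student (20 hours, score 50) who is entirely made up for illustration. The helper `pearson_r` is likewise an assumption of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

hours = [2, 4, 5, 6, 8, 10]
scores = [65, 72, 80, 78, 90, 95]

r_clean = pearson_r(hours, scores)

# Hypothetical outlier: studied 20 hours but scored only 50.
r_with_outlier = pearson_r(hours + [20], scores + [50])

print(round(r_clean, 2), round(r_with_outlier, 2))
```

With this one extra point the coefficient goes from roughly 0.98 to a negative value, which is why plotting the data before trusting r is worth the minute it takes.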

Correlation ≠ Causation

This is the most important caveat. A strong correlation between study hours and exam scores doesn’t prove that studying causes better scores. It could be:

Reverse causation: Students who are already scoring well may enjoy the material and study more, so performance drives study time rather than the other way round
Confounding variable: Motivation drives both studying and performance
Selection bias: Only students who studied showed up for the exam
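
The confounding case can be illustrated with a small simulation. Everything in it is an assumption made for the sketch: the data are randomly generated in standardized units, motivation is the only real driver, and study hours have no effect on scores at all. The two still end up clearly correlated.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

rng = random.Random(0)  # fixed seed so the run is repeatable
n_students = 2000

# Simulated confounder: motivation influences both variables.
motivation = [rng.gauss(0, 1) for _ in range(n_students)]
hours = [m + rng.gauss(0, 1) for m in motivation]   # depends on motivation
scores = [m + rng.gauss(0, 1) for m in motivation]  # depends on motivation, NOT on hours

r_confounded = pearson_r(hours, scores)
print(round(r_confounded, 2))  # positive, about 0.5 in expectation for this setup
```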

To establish causation, you need a controlled experiment — not just correlation.

Common Pitfalls

1. Restricting the range

If you only look at students who studied 7–10 hours, the correlation will typically drop because you’ve eliminated most of the variation in study time. Always use the full range of your data.
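
The six-student example is too small to show this effect (only two students studied 7 hours or more), so the sketch below uses simulated data instead. The model (score = 50 + 5 × hours plus random noise), the sample size, and the seed are all assumptions chosen purely for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated students: hours spread over 0-10, same linear trend everywhere.
hours = [rng.uniform(0, 10) for _ in range(500)]
scores = [50 + 5 * h + rng.gauss(0, 8) for h in hours]

r_full = pearson_r(hours, scores)

# Keep only the students who studied at least 7 hours.
kept = [(h, s) for h, s in zip(hours, scores) if h >= 7]
r_restricted = pearson_r([h for h, _ in kept], [s for _, s in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

The underlying relationship is identical in both subsets; only the spread of x has changed, and r shrinks with it.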

2. Combining groups

Pooling data from different groups (for example, male and female, or different age groups) can create, hide, or even reverse a correlation that exists within each group. Check the correlation within each group as well.
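
A deliberately extreme, made-up example shows how far pooling can mislead. In the sketch below, two hypothetical groups each have a perfect negative correlation, yet the combined data show a clearly positive one because the second group sits higher on both axes.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Invented data for two groups; within each, y falls as x rises.
group_a_x, group_a_y = [1, 2, 3, 4], [4, 3, 2, 1]
group_b_x, group_b_y = [6, 7, 8, 9], [9, 8, 7, 6]

r_a = pearson_r(group_a_x, group_a_y)
r_b = pearson_r(group_b_x, group_b_y)
r_pooled = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

print(r_a, r_b, round(r_pooled, 3))  # negative, negative, positive
```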

3. Ignoring non-linearity

If the relationship is curved (like a dose-response curve in medicine), r will underestimate the true strength of the relationship. Plot your data.
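
As a rough illustration, the sketch below uses an invented saturating curve, response = dose / (1 + dose), as a stand-in for a dose-response relationship. The response is fully determined by the dose, yet Pearson's r comes out well below 1. Applying the same formula to the ranks (which is what Spearman's ρ does when there are no ties) gives exactly 1. The helper names and dose values are assumptions of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson's r via the sums-based formula (hypothetical helper)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

# Invented doses and a saturating, noise-free response.
dose = [0, 1, 2, 4, 8, 16, 32]
response = [d / (1 + d) for d in dose]

r_linear = pearson_r(dose, response)
r_on_ranks = pearson_r(ranks(dose), ranks(response))

print(round(r_linear, 2), r_on_ranks)
```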

Calculate It Now

Enter your data into the Pearson Correlation Calculator to get r, R², p-value, and a scatter plot instantly.
