Intermediate, 30 minutes

T-Tests: Comparing Means

Master one-sample, two-sample, and paired t-tests. Learn when to use each test and how to interpret results.

Overview of T-Tests

T-tests are hypothesis tests for comparing means when the population standard deviation is unknown (which is almost always!).

| Test Type | Purpose | Example |
| --- | --- | --- |
| One-sample t-test | Compare sample mean to hypothesized value | Is average IQ different from 100? |
| Independent two-sample t-test | Compare means of two independent groups | Do men and women have different avg heights? |
| Paired t-test | Compare means of matched/paired observations | Did scores improve from pre-test to post-test? |

One-Sample T-Test

Tests whether a population mean differs from a specific value.

Hypotheses

  • H₀: μ = μ₀
  • H₁: μ ≠ μ₀ (or < or >)

One-Sample T-Statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

With df = n - 1

Assumptions

  1. Random sample from population
  2. Independence of observations
  3. Normality: The population is approximately normal OR the sample is large (n ≥ 30 is a common rule of thumb)

Example: One-Sample T-Test

A manufacturer claims light bulbs last 1000 hours. A sample of 25 bulbs has:

  • Mean: 985 hours
  • SD: 40 hours

Test at α = 0.05 if the mean differs from the claim.

Hypotheses:

  • H₀: μ = 1000
  • H₁: μ ≠ 1000 (two-tailed)

Test statistic: $t = \frac{985 - 1000}{40/\sqrt{25}} = \frac{-15}{8} = -1.875$

Critical value: df = 24, t₀.₀₂₅ = ±2.064

P-value: Between 0.05 and 0.10 (approximately 0.073)

Decision: Since |-1.875| < 2.064 (or p > 0.05), fail to reject H₀.

Conclusion: There is not sufficient evidence at α = 0.05 that the mean lifetime differs from 1000 hours.
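
The arithmetic in this example can be reproduced with a short Python sketch that uses only the standard library. The helper names (`one_sample_t`, `t_pdf`, `two_tailed_p`) are illustrative and not from any package. The p-value is obtained by numerically integrating the t density with Simpson's rule, which is an assumption of this sketch standing in for a statistics library.

```python
import math

def one_sample_t(mean, mu0, sd, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    se = sd / math.sqrt(n)          # standard error of the mean
    return (mean - mu0) / se, n - 1

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3    # P(0 < T < |t|)
    return 2 * (0.5 - central_area)

# Light bulb example: claimed mean 1000, sample mean 985, SD 40, n = 25
t_stat, df = one_sample_t(mean=985, mu0=1000, sd=40, n=25)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Comparing `p_value` with α = 0.05 leads to the same decision as comparing |t| with the critical value.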


Independent Two-Sample T-Test

Compares means of two separate, unrelated groups.

Hypotheses

  • H₀: μ₁ = μ₂ (or μ₁ - μ₂ = 0)
  • H₁: μ₁ ≠ μ₂ (or one-tailed alternatives)

Two-Sample T-Statistic (Pooled)

When equal variances are assumed:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where the pooled standard deviation is:

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$$

With df = n₁ + n₂ - 2

Two-Sample T-Statistic (Welch's)

When variances are NOT assumed equal:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

With adjusted df (Welch-Satterthwaite formula)
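
For reference, the Welch-Satterthwaite approximation is usually written as follows, using the same symbols as the statistic above. The result is generally not a whole number; rounding it down is a common conservative choice when reading critical values from a table.

$$df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$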

Assumptions

  1. Independent samples from two populations
  2. Random sampling
  3. Normality in each group OR large samples
  4. Equal variances (for pooled t-test) - check with F-test or Levene’s test

Example: Two-Sample T-Test

Compare test scores of two teaching methods:

| Method A | Method B |
| --- | --- |
| n₁ = 30 | n₂ = 35 |
| Mean = 78 | Mean = 82 |
| SD = 10 | SD = 12 |

Test at α = 0.05 if there’s a difference.

Hypotheses:

  • H₀: μ₁ = μ₂
  • H₁: μ₁ ≠ μ₂

Using Welch’s t-test:

Standard Error = $\sqrt{10^2/30 + 12^2/35} = \sqrt{3.33 + 4.11} = \sqrt{7.44} = 2.73$

$t = \frac{78 - 82}{2.73} = \frac{-4}{2.73} = -1.47$

df (Welch): approximately 62 (the Welch-Satterthwaite formula gives 62.96, rounded down)

P-value: approximately 0.147 (two-tailed)

Decision: Since p = 0.147 > 0.05, fail to reject H₀.

Conclusion: There is not sufficient evidence of a difference in mean scores between the two methods.
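
As a rough check on this example, the sketch below computes the Welch statistic and its degrees of freedom from the summary statistics, and the pooled version for comparison. The function names `welch_t` and `pooled_t` are hypothetical helpers written for this illustration, not library calls.

```python
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic, Welch-Satterthwaite df, and SE from summary statistics."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # variance of each group mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df, se

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Pooled (equal-variance) t statistic, df, and SE from summary statistics."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se, df, se

# Method A (mean 78, SD 10, n 30) vs. Method B (mean 82, SD 12, n 35)
t_w, df_w, se_w = welch_t(78, 10, 30, 82, 12, 35)
t_p, df_p, se_p = pooled_t(78, 10, 30, 82, 12, 35)
print(f"Welch:  t = {t_w:.2f}, df = {df_w:.2f}, SE = {se_w:.2f}")
print(f"Pooled: t = {t_p:.2f}, df = {df_p}, SE = {se_p:.2f}")
```

With these data the two versions give nearly the same t statistic, which is expected when the sample SDs and group sizes are similar.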


Paired T-Test

Compares two measurements on the same subjects or matched pairs.

When to Use Paired T-Test

  • Before/after measurements on same subjects
  • Twin studies
  • Matched case-control studies
  • Left vs right measurements on same person

Hypotheses

  • H₀: μ_d = 0 (no mean difference)
  • H₁: μ_d ≠ 0 (or one-tailed)

Paired T-Statistic

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}}$$

Where:

  • $\bar{d}$ = mean of differences
  • $s_d$ = standard deviation of differences
  • n = number of pairs

With df = n - 1

Assumptions

  1. Paired data (natural pairing)
  2. Random sample of pairs
  3. Normality of differences OR large n

Example: Paired T-Test

A weight loss program measures 10 participants before and after:

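| Subject | Before | After | Difference (d) |
| --- | --- | --- | --- |
| 1 | 180 | 175 | -5 |
| 2 | 220 | 212 | -8 |
| 3 | 195 | 190 | -5 |
| 4 | 185 | 183 | -2 |
| 5 | 240 | 231 | -9 |
| 6 | 170 | 168 | -2 |
| 7 | 200 | 191 | -9 |
| 8 | 175 | 172 | -3 |
| 9 | 210 | 205 | -5 |
| 10 | 190 | 184 | -6 |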

Summary of differences: $\bar{d} = -5.4$, $s_d \approx 2.633$ (sample standard deviation of the ten differences)

Test at α = 0.05 if the program produces weight loss.

Hypotheses:

  • H₀: μ_d = 0 (no change)
  • H₁: μ_d < 0 (weight decreased) - left-tailed

Test statistic: $t = \frac{-5.4 - 0}{2.633/\sqrt{10}} = \frac{-5.4}{0.8326} \approx -6.49$

Critical value: df = 9, t₀.₀₅ = -1.833

P-value: < 0.0001

Decision: Since t = -6.49 < -1.833, reject H₀.

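Conclusion: There is significant evidence that mean weight decreased after the program. Because there is no control group, this supports the program's effectiveness but cannot rule out other causes of the change.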
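
A paired t-test is a one-sample t-test on the differences, so it can be recomputed directly from the raw table. The sketch below is one way to do that with Python's standard library; it takes each difference as after minus before, as the table does, and uses the sample standard deviation (dividing by n - 1).

```python
import math
import statistics

before = [180, 220, 195, 185, 240, 170, 200, 175, 210, 190]
after = [175, 212, 190, 183, 231, 168, 191, 172, 205, 184]

# Differences are after - before, so weight loss shows up as negative values
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)            # sample SD (divides by n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))    # tests H0: mean difference = 0

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.2f}, df = {n - 1}")
```

Recomputing from the raw data like this is a quick way to catch rounding or transcription slips in hand-calculated summary statistics.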


Choosing the Right T-Test

| Situation | Test |
| --- | --- |
| One sample, compare to specific value | One-sample t |
| Two separate groups | Independent two-sample t |
| Same subjects measured twice | Paired t |
| Before/after on same subjects | Paired t |
| Treatment vs control (different people) | Independent two-sample t |

Effect Size: Cohen’s d

P-values don’t tell you how big the effect is. Use Cohen’s d for effect size.

Cohen's d

For one-sample: $d = \frac{\bar{x} - \mu_0}{s}$

For two-sample: $d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}$

For paired: $d = \frac{\bar{d}}{s_d}$

| Cohen’s d | Interpretation |
| --- | --- |
| 0.2 | Small effect |
| 0.5 | Medium effect |
| 0.8 | Large effect |

Example: Effect Size for Weight Loss

From the paired example ($\bar{d} = -5.4$, $s_d \approx 2.633$): $d = \frac{-5.4}{2.633} \approx -2.05$

This is a very large effect (|d| > 0.8).

Both statistically significant AND practically meaningful!
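
The two-sample formula can be applied the same way. The sketch below computes Cohen's d for the teaching-methods data using the pooled standard deviation. Treating the 0.2 / 0.5 / 0.8 benchmarks as cut-offs, and the "negligible" label below 0.2, are assumptions of this sketch rather than part of the table.

```python
import math

def cohens_d_two_sample(m1, s1, n1, m2, s2, n2):
    """Cohen's d for two independent groups, standardized by the pooled SD."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

def label(d):
    """Rough label, treating the 0.2 / 0.5 / 0.8 benchmarks as cut-offs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"   # label assumed for this sketch

# Method A (mean 78, SD 10, n 30) vs. Method B (mean 82, SD 12, n 35)
d = cohens_d_two_sample(78, 10, 30, 82, 12, 35)
print(f"d = {d:.2f} ({label(d)})")
```

Here the effect is on the small side, which fits with the test failing to reach significance at these sample sizes.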


Confidence Intervals from T-Tests

Every t-test can produce a confidence interval:

CI for Difference in Means

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \times SE$$

Example: 95% CI from Two-Sample Test

From the teaching methods example:

  • Difference = 78 - 82 = -4
  • SE = 2.73
  • t₀.₀₂₅,₆₂ ≈ 2.00

95% CI: -4 ± 2.00(2.73) = -4 ± 5.46 = (-9.46, 1.46)

Since this interval includes 0, we can’t conclude the means differ (consistent with failing to reject H₀).
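
The interval arithmetic is simple enough to wrap in a small helper. In the sketch below, `mean_diff_ci` is a hypothetical function name, and the critical value is passed in as read from a t table rather than computed.

```python
def mean_diff_ci(diff, se, t_crit):
    """Confidence interval for a difference in means: diff ± t_crit * se."""
    margin = t_crit * se
    return diff - margin, diff + margin

# Values from the teaching-methods example; t_crit comes from a t table
low, high = mean_diff_ci(diff=78 - 82, se=2.73, t_crit=2.00)
contains_zero = low <= 0 <= high
print(f"95% CI: ({low:.2f}, {high:.2f}), contains 0: {contains_zero}")
```

Checking whether the interval contains 0 gives the same decision as the two-tailed test at the matching α.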


Assumptions Violations


Summary

In this lesson, you learned:

  • One-sample t-test: Compare sample mean to hypothesized value
  • Two-sample t-test: Compare means of independent groups
  • Paired t-test: Compare paired/matched observations
  • Use Welch’s t-test when unsure about equal variances
  • Effect size (Cohen’s d) measures practical significance
  • Confidence intervals complement hypothesis tests
  • Always check assumptions: normality, independence, and (for the pooled test) equal variances

Practice Problems

1. A sample of 16 students has mean GPA 3.2 with SD 0.5. Test at α = 0.05 if the mean differs from 3.0.

2. Compare two groups:

  • Group A: n = 20, mean = 45, SD = 8
  • Group B: n = 25, mean = 50, SD = 10

Test at α = 0.05 if there’s a significant difference.

3. Eight patients’ blood pressure before and after medication:

  • Before: 145, 150, 138, 155, 142, 148, 152, 140
  • After: 140, 142, 135, 148, 138, 145, 147, 136

Test at α = 0.05 if the medication lowered blood pressure.

4. For problem 3, calculate Cohen’s d and interpret the effect size.

Answers

1. One-sample t-test

  • t = (3.2 - 3.0)/(0.5/√16) = 0.2/0.125 = 1.6
  • df = 15, critical t = ±2.131
  • |1.6| < 2.131, fail to reject H₀
  • Not enough evidence mean differs from 3.0

2. Two-sample t-test (Welch’s)

  • SE = √(64/20 + 100/25) = √(3.2 + 4) = √7.2 = 2.68
  • t = (45 - 50)/2.68 = -1.87
  • df ≈ 42, p ≈ 0.068
  • Fail to reject H₀ (p > 0.05)

3. Paired t-test

  • Differences: -5, -8, -3, -7, -4, -3, -5, -4
  • Mean d = -4.875, SD_d = 1.808
  • t = -4.875/(1.808/√8) = -4.875/0.639 = -7.63
  • df = 7, critical t = -1.895 (one-tailed)
  • Reject H₀, medication significantly lowered BP

4. Cohen’s d

  • d = -4.875/1.808 = -2.70
  • This is a very large effect (|d| > 0.8)
  • Both statistically and practically significant
