Advanced · 25 minutes

Effect Size and Statistical Power

Go beyond p-values to understand practical significance. Learn effect size measures, power analysis, and sample size planning.


The Limitations of P-Values

P-Value Misleading Example

Study 1: Blood pressure drug reduces BP by 0.5 mmHg, p < 0.001, n = 100,000

Study 2: Blood pressure drug reduces BP by 15 mmHg, p = 0.08, n = 20

Which result is more practically important? Study 2: a 15 mmHg reduction is clinically meaningful, while Study 1's 0.5 mmHg reduction is trivial despite its tiny p-value. The p-value alone is misleading: it reflects sample size as much as effect magnitude.


What is Effect Size?

Effect size measures the magnitude of an effect, independent of sample size.

P-Value                      Effect Size
Is there an effect?          How big is the effect?
Depends on n                 Independent of n
Statistical significance     Practical significance

Common Effect Size Measures

1. Cohen’s d (Standardized Mean Difference)

Cohen's d

d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}

For one-sample: d = \frac{\bar{x} - \mu_0}{s}

For paired: d = \frac{\bar{d}}{s_d}

Cohen's d    Interpretation
0.2          Small effect
0.5          Medium effect
0.8          Large effect

Calculating Cohen's d

Treatment group: mean = 75, n = 30
Control group: mean = 70, n = 30
Pooled SD = 10

d = \frac{75 - 70}{10} = 0.5

Medium effect: Treatment improves scores by half a standard deviation.
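The calculation above can be sketched as a small helper. This is a minimal illustration (the function name cohens_d is ours, not from any particular library) that pools the two group standard deviations and standardizes the mean difference:

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Worked example from the text: means 75 vs 70, common SD 10, n = 30 each
print(round(cohens_d(75, 70, 10, 10, 30, 30), 2))  # 0.5
```

With equal SDs of 10 the pooled SD is simply 10, reproducing d = 0.5 from the example.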

2. Correlation Coefficient (r)

r      Interpretation
0.1    Small
0.3    Medium
0.5    Large

3. Eta-Squared and Partial Eta-Squared (ANOVA)

Eta-Squared

\eta^2 = \frac{SS_{between}}{SS_{total}}

η²     Interpretation
0.01   Small
0.06   Medium
0.14   Large
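Eta-squared can be computed directly from the between-group and total sums of squares. A minimal sketch with NumPy (the function name and the example group scores are hypothetical):

```python
import numpy as np

def eta_squared(*groups):
    """Eta-squared: SS_between / SS_total for a one-way ANOVA layout."""
    all_vals = np.concatenate(groups)
    grand_mean = all_vals.mean()
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = ((all_vals - grand_mean) ** 2).sum()
    return ss_between / ss_total

# Hypothetical scores for three groups (means 5, 7, 9)
g1 = np.array([4, 5, 6, 5])
g2 = np.array([6, 7, 8, 7])
g3 = np.array([8, 9, 10, 9])
print(round(eta_squared(g1, g2, g3), 2))  # 0.84
```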

4. Odds Ratio (OR)

For binary outcomes and logistic regression.

OR     Interpretation
1.5    Small effect
2.5    Medium effect
4.0    Large effect
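An odds ratio is the ratio of the odds of the outcome in two groups, computed from the cells of a 2×2 table. A minimal sketch (function name and counts are illustrative):

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table:
       exposed:   a events, b non-events
       unexposed: c events, d non-events
    """
    return (a / b) / (c / d)  # equivalently (a * d) / (b * c)

# Hypothetical counts: 30/70 events in the treated group vs 15/85 in controls
print(round(odds_ratio(30, 70, 15, 85), 2))  # 2.43
```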

Statistical Power

Power is the probability of detecting a real effect when it exists.

Power Definition

\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})

  • α = P(Type I error) = false positive rate
  • β = P(Type II error) = false negative rate
  • Power = 1 - β = correct detection rate

Factors Affecting Power

Factor             Effect on Power
↑ Sample size      ↑ Power
↑ Effect size      ↑ Power
↑ α level          ↑ Power
↓ Variability      ↑ Power
One-tailed test    ↑ Power (vs two-tailed)

Power and Sample Size

To detect d = 0.5 with α = 0.05:

n per group    Power
20             0.34
40             0.60
64             0.80
100            0.94

Need about 64 per group for 80% power.
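Figures like those in the table can be checked by Monte Carlo simulation: draw many samples under the alternative hypothesis and count how often the t-test rejects. Below is a minimal sketch using NumPy and SciPy; the function name simulated_power and the simulation settings are our own choices, not a standard API:

```python
import numpy as np
from scipy import stats

def simulated_power(d, n_per_group, alpha=0.05, sims=5000, seed=0):
    """Estimate power of a two-sample t-test by Monte Carlo simulation."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(sims):
        x = rng.normal(0.0, 1.0, n_per_group)  # control group
        y = rng.normal(d, 1.0, n_per_group)    # treatment, shifted by d SDs
        if stats.ttest_ind(x, y).pvalue < alpha:
            rejections += 1
    return rejections / sims

print(simulated_power(0.5, 64))  # should land near 0.80
```

With d = 0.5 and 64 per group the estimate should fall close to the tabled 0.80, up to simulation noise.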


Power Analysis

A Priori Power Analysis (Planning)

Before collecting data: How many subjects do I need?

Sample Size Estimation

For comparing two means with equal groups:

n = \frac{2(z_{1-\alpha/2} + z_{1-\beta})^2}{d^2}

Where d = expected Cohen’s d

Sample Size Calculation

Want to detect medium effect (d = 0.5) with:

  • Power = 0.80 (z = 0.84)
  • α = 0.05 two-tailed (z = 1.96)

n = \frac{2(1.96 + 0.84)^2}{0.5^2} = \frac{2(7.84)}{0.25} = \frac{15.68}{0.25} = 62.7

Need about 63 per group (126 total).
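The same formula translates directly into code. A minimal sketch using SciPy's normal quantile function (the helper name n_per_group is ours):

```python
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for comparing two means."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = .05, two-tailed
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    return 2 * (z_alpha + z_beta) ** 2 / d ** 2

print(math.ceil(n_per_group(0.5)))  # 63
```

Rounding up gives the 63 per group from the worked example; smaller expected effects drive the requirement up quickly (the d² in the denominator).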

Post-Hoc Power Analysis (After the Fact)

Computing power after the study from the observed effect size ("observed power") is widely discouraged: observed power is a one-to-one function of the p-value and adds no new information. Report the effect size with a confidence interval instead.

Sample Size Tables

Two Independent Groups (d = expected effect)

d      α = .05, Power = .80    α = .05, Power = .90
0.2    393 per group           526 per group
0.5    64 per group            85 per group
0.8    26 per group            34 per group

Paired Samples (d = expected effect)

d      α = .05, Power = .80    α = .05, Power = .90
0.2    199 pairs               265 pairs
0.5    34 pairs                44 pairs
0.8    15 pairs                19 pairs

Reporting Effect Sizes

Report the effect size and its confidence interval alongside the test statistic and p-value, and interpret the magnitude in context (e.g., "t(58) = 2.05, p = .045, d = 0.53, 95% CI [0.01, 1.04]").

Converting Between Effect Sizes

Effect Size Conversions

Between Cohen's d and r: r = \frac{d}{\sqrt{d^2 + 4}}

d = \frac{2r}{\sqrt{1 - r^2}}

Cohen's d    r
0.20         0.10
0.50         0.24
0.80         0.37
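The two conversion formulas are exact inverses of each other (for equal group sizes), which is easy to verify in code. A minimal sketch with hypothetical helper names:

```python
import math

def d_to_r(d):
    """Convert Cohen's d to correlation r (equal group sizes assumed)."""
    return d / math.sqrt(d**2 + 4)

def r_to_d(r):
    """Convert correlation r back to Cohen's d."""
    return 2 * r / math.sqrt(1 - r**2)

print(round(d_to_r(0.5), 2))          # 0.24, matching the table
print(round(r_to_d(d_to_r(0.8)), 2))  # 0.8 (round trip recovers d)
```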

Practical vs Statistical Significance

Statistical Significance       Practical Significance
p < α                          Effect size is meaningful
Based on sample                Based on context
Can be trivial with large n    Requires judgment

The Full Picture

Study Results:

  • p = 0.02 (statistically significant)
  • d = 0.15 (small effect)
  • 95% CI for d: [0.02, 0.28]

Interpretation: While statistically significant, the effect is small. The intervention produces only a 0.15 SD improvement. Consider whether this small gain justifies the cost and effort.


Summary

In this lesson, you learned:

  • P-values indicate existence of effect, not importance
  • Effect size measures magnitude independent of sample size
  • Cohen’s d: 0.2 (small), 0.5 (medium), 0.8 (large)
  • Power = P(detecting a real effect) = 1 - β
  • 80% power is conventional minimum
  • Power increases with n, effect size, and α
  • A priori power analysis determines needed sample size
  • Always report effect size and confidence intervals

Practice Problems

1. Treatment vs control: Mean difference = 8, pooled SD = 20. Calculate Cohen’s d and interpret.

2. A study with n = 500 per group finds p = 0.001 but d = 0.15. Interpret this finding.

3. You want 80% power to detect d = 0.4 at α = 0.05. Approximately how many subjects per group do you need?

4. Study A: p = 0.04, d = 0.8, n = 25. Study B: p = 0.001, d = 0.2, n = 500. Which study shows a more important finding?

Answers

1. d = 8/20 = 0.4 (small to medium effect) The treatment improves outcomes by 0.4 standard deviations—a modest but potentially meaningful improvement depending on context.

2. Despite high statistical significance (p = 0.001), the effect is trivially small (d = 0.15). The large sample size made a tiny difference significant. This is likely not practically meaningful.

3. Using the formula: n = 2(1.96 + 0.84)² / 0.4² = 15.68 / 0.16 ≈ 98, so about 98 to 100 per group (between the table values for d = 0.2 and d = 0.5).

4. Study A shows a more important finding.

  • Study A has a large effect (d = 0.8) despite smaller sample
  • Study B has only a small effect (d = 0.2) that’s only significant due to large n
  • The large effect in Study A suggests practical importance
  • Study B’s tiny effect may not be worth pursuing despite significance
