Sample Size Determination
Learn to calculate required sample size. Understand margin of error, confidence level, power analysis, and planning studies.
Why Sample Size Matters
Key Factors
| Factor | Effect on Required n |
|---|---|
| Smaller margin of error | Larger n needed |
| Higher confidence level | Larger n needed |
| Higher power | Larger n needed |
| Larger effect size | Smaller n needed |
| Greater variability (σ) | Larger n needed |
Sample Size for Estimating a Mean
n = (z*σ/E)²
Where:
- z* = critical value for confidence level
- σ = population standard deviation (or estimate)
- E = desired margin of error
Goal: Estimate mean household income within $2,000 (margin of error)
Given:
- 95% confidence → z* = 1.96
- Estimated σ = $15,000
- E = $2,000
Solution:
n = (1.96 × 15,000 / 2,000)² = (14.7)² = 216.09
Required: n = 217 (always round UP)
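The calculation above can be sketched in Python. This is a minimal illustration rather than part of the lesson: the function name `sample_size_mean` is hypothetical, and it derives z* from the confidence level with the standard library's `statistics.NormalDist` instead of using the rounded value 1.96.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin, confidence=0.95):
    """Hypothetical helper: smallest n so the margin of error is at most `margin`."""
    # Two-sided critical value z*, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z* * sigma / E)^2, rounded up to a whole observation
    return math.ceil((z * sigma / margin) ** 2)


# Household income example: sigma = $15,000, E = $2,000
print(sample_size_mean(sigma=15_000, margin=2_000))
```

For the household income example this returns 217, matching the hand calculation.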
Sample Size for Estimating a Proportion
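n = p(1-p)(z*/E)²
Where:
- p = anticipated proportion
- z* = critical value for confidence level
- E = desired margin of error (as a proportion)
If p is unknown, use p = 0.5 (conservative estimate)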
Goal: Estimate voter support within ±3 percentage points
Given:
- 95% confidence → z* = 1.96
- E = 0.03
- p unknown → use 0.5
Solution:
n = 0.5 × 0.5 × (1.96 / 0.03)² = 0.25 × 4268.44 = 1067.11
Required: n = 1068 (round UP)
That’s why polls often survey about 1000 people!
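If a previous poll showed 60% support, use p = 0.6:
n = 0.6 × 0.4 × (1.96 / 0.03)² = 0.24 × 4268.44 = 1024.43
Required: n = 1025
Slightly smaller because p(1-p) = 0.24 is below its maximum of 0.25, which occurs at p = 0.5.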
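Both polling calculations can be reproduced with a short Python sketch. The function name `sample_size_proportion` is an assumption made for illustration, and z* is computed from the confidence level rather than rounded to 1.96.

```python
import math
from statistics import NormalDist


def sample_size_proportion(margin, p=0.5, confidence=0.95):
    """Hypothetical helper: n = p(1-p)(z*/E)^2, rounded up."""
    # Two-sided critical value z* for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p = 0.5 maximizes p(1-p), so the default gives the most conservative n
    return math.ceil(p * (1 - p) * (z / margin) ** 2)


print(sample_size_proportion(0.03))         # p unknown, use 0.5
print(sample_size_proportion(0.03, p=0.6))  # prior poll showed 60% support
```

The two calls give 1068 and 1025, in line with the worked examples.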
Sample Size for Hypothesis Testing (Power Analysis)
For hypothesis tests, sample size depends on:
- Significance level (α): Typically 0.05
- Power (1-β): Often 0.80 or 0.90
- Effect size: How large a difference matters
- Variability: Population standard deviation
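For comparing the means of two equal-sized groups, the sample size per group is:
n = 2(zα/2 + zβ)²σ²/δ²
Where: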
- zα/2 = critical value for significance level
- zβ = critical value for power (e.g., 0.84 for 80% power)
- σ = standard deviation
- δ = minimum detectable difference
Goal: Detect a 5-point improvement in blood pressure
Given:
- α = 0.05 (two-tailed) → zα/2 = 1.96
- Power = 80% → zβ = 0.84
- σ = 12 mmHg
- δ = 5 mmHg
Solution:
n = 2 × (1.96 + 0.84)² × 12² / 5² = 2 × 7.84 × 144 / 25 = 90.32
Required: n = 91 per group (182 total)
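The blood pressure example can be checked with a small Python sketch. It assumes the usual normal-approximation formula for comparing two equal-sized groups, n = 2(zα/2 + zβ)²σ²/δ² per group, which reproduces the n = 91 above. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Hypothetical helper: per-group n for a two-sided, two-sample comparison
    of means, using the normal approximation (an assumption of this sketch)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


n = n_per_group(delta=5, sigma=12)
print(n, "per group,", 2 * n, "total")
```

Raising `power` or shrinking `delta` increases the result, consistent with the Key Factors table.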
Effect Size
Effect size standardizes the difference you want to detect by expressing it in standard deviation units: d = δ/σ.
| Effect Size (d) | Interpretation |
|---|---|
| 0.2 | Small |
| 0.5 | Medium |
| 0.8 | Large |
To detect a medium effect (d = 0.5) with 80% power at α = 0.05:
n per group ≈ 64
For small effect (d = 0.2): n per group ≈ 393
For large effect (d = 0.8): n per group ≈ 26
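When the difference is expressed as d = δ/σ, the normal-approximation formula reduces to n = 2(zα/2 + zβ)²/d² per group. The sketch below, with the hypothetical function name `n_per_group_from_d`, applies that approximation. It gives 393, 63, and 25 for small, medium, and large effects. The values of 64 and 26 quoted above are slightly larger, which is consistent with calculations based on the t distribution rather than the normal approximation.

```python
import math
from statistics import NormalDist


def n_per_group_from_d(d, alpha=0.05, power=0.80):
    """Hypothetical helper: per-group n from a standardized effect size d,
    using the normal approximation (not an exact t-test calculation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, d, n_per_group_from_d(d))
```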
Common Sample Size Tables
For Estimating Proportions (95% CI)
| Margin of Error | n (p = 0.5) |
|---|---|
| ±10% | 97 |
| ±5% | 385 |
| ±3% | 1068 |
| ±2% | 2401 |
| ±1% | 9604 |
For Comparing Two Means (80% power, α = 0.05)
| Effect Size | n per Group |
|---|---|
| Small (0.2) | 393 |
| Medium (0.5) | 64 |
| Large (0.8) | 26 |
Practical Considerations
Adjusting for Nonresponse
n_adjusted = n / r
Where r = expected response rate (as a decimal)
Calculated n = 400, expected response rate = 60%
n_adjusted = 400 / 0.60 = 666.67
Need to initially contact 667 people to get ~400 responses.
Finite Population Correction
If sampling a significant fraction of the population, you need fewer observations:
Where:
- n₀ = calculated sample size
- N = population size
Calculated n₀ = 400, Population N = 2000
Need only about 334 (rounding up) instead of 400.
Note: If N is very large, this correction is negligible.
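Both practical adjustments can be sketched in Python. The function names are hypothetical. The finite population correction here uses the common form n = n₀ / (1 + (n₀ - 1)/N), which is an assumption of this sketch and may differ slightly from the variant intended in the lesson.

```python
import math


def adjust_for_nonresponse(n, response_rate):
    """Hypothetical helper: number of people to contact so that about n respond."""
    return math.ceil(n / response_rate)


def finite_population_correction(n0, population):
    """Hypothetical helper: corrected n using n0 / (1 + (n0 - 1) / N),
    an assumed (commonly used) form of the correction."""
    return math.ceil(n0 / (1 + (n0 - 1) / population))


print(adjust_for_nonresponse(400, 0.60))        # nonresponse example
print(finite_population_correction(400, 2000))  # finite population example
```

For N = 2000 and n₀ = 400 the unrounded corrected value is about 333.5, which this sketch rounds up to 334. The nonresponse example gives 667.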
Software for Sample Size
Most researchers use software:
- G*Power (free, comprehensive)
- R (packages: pwr, samplesize)
- Stata (power command)
- Online calculators
Summary
In this lesson, you learned:
- Sample size for means: n = (z*σ/E)²
- Sample size for proportions: n = p(1-p)(z*/E)²
- Use p = 0.5 when proportion is unknown (conservative)
- Power analysis balances α, power, effect size, and variability
- Effect size standardizes the difference you want to detect
- Adjust for nonresponse and finite populations
- Always round UP to ensure sufficient precision
Practice Problems
1. You want to estimate mean height within 1 inch with 95% confidence. SD is estimated at 3 inches. What sample size is needed?
2. A poll wants margin of error of ±4% at 95% confidence. How many people should be surveyed?
3. A researcher expects 70% response rate and needs 300 responses. How many should be initially contacted?
4. Why might a researcher use p = 0.5 even when they expect the true proportion to be about 0.3?
Answers
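1. n = (1.96 × 3 / 1)² = (5.88)² = 34.57
Required: n = 35
2. n = 0.5 × 0.5 × (1.96 / 0.04)² = 0.25 × 2401 = 600.25
Required: n = 601
3. n = 300 / 0.70 = 428.57
Initially contact: 429 people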
4. Two reasons:
a) Conservative estimate: p = 0.5 maximizes p(1-p), giving the largest sample size. This ensures the margin of error will be at most what was specified.
b) Uncertainty: The expected p = 0.3 is just an estimate. If the true value is different (closer to 0.5), the sample size based on p = 0.3 might be insufficient.
Using p = 0.5 protects against underestimating the needed sample size.
Next Steps
Apply your knowledge to research design:
- Hypothesis Testing Basics - Testing claims
- Effect Size and Power - Statistical power
- Confidence Intervals - Interval estimation