The Normal Distribution
Understand the most important distribution in statistics. Learn the properties of the bell curve, its parameters, and why it appears everywhere.
Introduction
The normal distribution (also called the Gaussian distribution or bell curve) is the most widely used probability distribution in statistics. It approximates many natural phenomena and underlies much of statistical inference.
What is a Normal Distribution?
A normal distribution is a continuous probability distribution that is:
- Perfectly symmetric around its center
- Bell-shaped (highest in middle, tapering at both ends)
- Defined by exactly two parameters: mean (μ) and standard deviation (σ)
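Notation: X ~ N(μ, σ)
Read as: “X follows a normal distribution with mean μ and standard deviation σ”. Throughout this lesson the second parameter is the standard deviation, not the variance.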
Properties of Normal Distributions
1. Symmetry
The curve is perfectly symmetric around the mean. The left half is a mirror image of the right half.
Implications:
- Mean = Median = Mode
- 50% of data falls below the mean, 50% above
- Skewness = 0
2. The Mean Determines Location
Changing μ shifts the entire curve left or right without changing its shape.
Three normal distributions with σ = 10:
- N(50, 10): centered at 50
- N(80, 10): centered at 80
- N(100, 10): centered at 100
All three curves have identical shapes—just different locations on the x-axis.
3. The Standard Deviation Determines Spread
Changing σ makes the curve wider (larger σ) or narrower (smaller σ).
Three normal distributions centered at 100:
- N(100, 5): narrow, tall peak
- N(100, 15): medium spread
- N(100, 30): wide, flat peak
Smaller σ means data is concentrated near the mean. Larger σ means data is more spread out.
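The effect of μ and σ on the curve can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the lesson. It compares the peak height of each curve from the two examples above.

```python
from statistics import NormalDist

# Same sigma, different means: the curve shifts but keeps its shape,
# so the peak height is identical for all three.
shifted = [NormalDist(mu=m, sigma=10) for m in (50, 80, 100)]
peak_heights_shift = [d.pdf(d.mean) for d in shifted]

# Same mean, different sigmas: the peak height is 1 / (sigma * sqrt(2*pi)),
# so a larger sigma gives a lower, wider curve.
spread = [NormalDist(mu=100, sigma=s) for s in (5, 15, 30)]
peak_heights_spread = [d.pdf(d.mean) for d in spread]

for d, h in zip(shifted + spread, peak_heights_shift + peak_heights_spread):
    print(f"N({d.mean:g}, {d.stdev:g}): peak density {h:.4f}")
```

Because the total area is always 1, a narrower curve has to be taller, which is why the peak falls as σ grows.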
4. Asymptotic Tails
The tails of the curve approach but never touch the x-axis. Theoretically, any value from negative infinity to positive infinity is possible.
5. Area Under the Curve = 1
The total area under any normal curve equals 1 (or 100%). This is what makes it a valid probability distribution.
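The Probability Density Function
The height of the curve at any value x is given by the density:
f(x) = 1 / (σ√(2π)) · e^(−(x − μ)² / (2σ²))
Here μ sets the center and σ sets the spread, matching properties 2 and 3 above.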
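As a rough illustration, the normal density can be written out by hand and compared with the standard library. The function names `normal_pdf` and `area_under_curve` are hypothetical helpers introduced for this sketch. The second one uses a simple midpoint rule to confirm that the area under the curve is very close to 1.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, where sigma is the standard deviation.

    f(x) = exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
    """
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, width=8.0, steps=20_000):
    """Midpoint-rule approximation of the area over mu ± width * sigma."""
    lo = mu - width * sigma
    dx = 2 * width * sigma / steps
    total = sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps))
    return total * dx

print(normal_pdf(100, 100, 15))        # hand-written density at the mean
print(NormalDist(100, 15).pdf(100))    # standard-library value for comparison
print(area_under_curve(100, 15))       # should be very close to 1
```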
The Empirical Rule (68-95-99.7)
For any normal distribution, approximately:
- 68% of data falls within 1 standard deviation of the mean
- 95% of data falls within 2 standard deviations of the mean
- 99.7% of data falls within 3 standard deviations of the mean
| Range | Percentage |
|---|---|
| μ ± 1σ | 68.27% |
| μ ± 2σ | 95.45% |
| μ ± 3σ | 99.73% |
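The percentages in the table can be reproduced from the standard normal CDF. This is a small sketch using `statistics.NormalDist`; the helper name `within_k_sd` is an assumption made for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.2%}")
```

The result does not depend on μ or σ, because standardizing maps every normal distribution onto the same N(0, 1) curve.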
Adult male heights follow N(70, 3) — mean 70 inches, SD 3 inches.
68% of men are between:
- 70 - 3 = 67 inches and 70 + 3 = 73 inches
95% of men are between:
- 70 - 6 = 64 inches and 70 + 6 = 76 inches
99.7% of men are between:
- 70 - 9 = 61 inches and 70 + 9 = 79 inches
Only about 0.3% of men are shorter than 61 inches or taller than 79 inches.
The Standard Normal Distribution
The standard normal distribution is a special normal distribution with:
- Mean μ = 0
- Standard deviation σ = 1
Any normal distribution can be converted to the standard normal using z-scores:
z = (x - μ)/σ
This standardization is crucial because:
- Z-tables only work for N(0, 1)
- It allows comparison across different distributions
- It simplifies probability calculations
IQ scores: X ~ N(100, 15)
What’s the probability someone has IQ above 130?
Step 1: Convert to z-score
- z = (130 - 100)/15 = 2.0
Step 2: Find P(Z > 2.0)
- From z-table: P(Z < 2.0) = 0.9772
- P(Z > 2.0) = 1 - 0.9772 = 0.0228 or 2.28%
About 2.3% of the population has an IQ above 130.
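The IQ calculation can be checked without a z-table. The sketch below assumes Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Step 1: standardize the cutoff.
z = (130 - iq.mean) / iq.stdev

# Step 2: upper-tail probability, P(X > 130) = 1 - P(X < 130).
p_above = 1 - iq.cdf(130)

print(f"z = {z:.1f}, P(IQ > 130) = {p_above:.4f}")
```

The library works on the original scale directly, so the z-score is computed here only to mirror the manual steps.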
Calculating Normal Probabilities
Three Types of Problems
Type 1: Less than
Battery life: X ~ N(500, 25) hours
P(battery lasts less than 450 hours)?
z = (450 - 500)/25 = -2.0
P(Z < -2.0) = 0.0228 or 2.28%
Type 2: Greater than
Using the same battery example:
P(battery lasts more than 550 hours)?
z = (550 - 500)/25 = 2.0
P(Z > 2.0) = 1 - P(Z < 2.0) = 1 - 0.9772 = 0.0228 or 2.28%
Type 3: Between
P(battery lasts between 480 and 530 hours)?
z₁ = (480 - 500)/25 = -0.8, z₂ = (530 - 500)/25 = 1.2
P(-0.8 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.8) = 0.8849 - 0.2119 = 0.6730 or 67.3%
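All three battery probabilities reduce to evaluations of the cumulative distribution function. A minimal sketch, assuming `statistics.NormalDist` and illustrative variable names:

```python
from statistics import NormalDist

battery = NormalDist(mu=500, sigma=25)  # battery life in hours

# Less than: P(X < 450) is the CDF at 450.
p_less = battery.cdf(450)

# Greater than: P(X > 550) is the complement of the CDF at 550.
p_more = 1 - battery.cdf(550)

# Between: P(480 < X < 530) is the difference of two CDF values.
p_between = battery.cdf(530) - battery.cdf(480)

print(f"P(X < 450)       = {p_less:.4f}")
print(f"P(X > 550)       = {p_more:.4f}")
print(f"P(480 < X < 530) = {p_between:.4f}")
```

Small differences from the table-based answers in the last decimal place are expected, since z-tables round to four digits.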
Finding Values from Probabilities (Inverse)
Sometimes we need to find what value corresponds to a given probability.
Test scores: X ~ N(500, 100)
What score is needed to be in the top 10%?
Step 1: Find z for 90th percentile
- P(Z < z) = 0.90
- From z-table: z ≈ 1.28
Step 2: Convert back to original scale
- x = μ + zσ = 500 + 1.28(100) = 628
A score of 628 or higher puts you in the top 10%.
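The inverse lookup corresponds to the inverse CDF (quantile function). The sketch below uses `NormalDist.inv_cdf` from the standard library; variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=500, sigma=100)

# Step 1: z-value with 90% of the standard normal below it.
z90 = NormalDist().inv_cdf(0.90)

# Step 2: convert back to the original scale, x = mu + z * sigma.
cutoff_manual = scores.mean + z90 * scores.stdev

# The same result in one step, working on the original scale.
cutoff = scores.inv_cdf(0.90)

print(f"z = {z90:.4f}, cutoff = {cutoff:.1f}")
```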
Common Z-Values to Know
| Confidence Level | Two-Tailed Z | One-Tailed Z |
|---|---|---|
| 90% | ±1.645 | 1.28 |
| 95% | ±1.96 | 1.645 |
| 99% | ±2.576 | 2.33 |
These values appear constantly in:
- Confidence intervals
- Hypothesis testing
- Quality control
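The critical values in the table come from the inverse CDF of the standard normal. As a sketch (the function name `critical_values` is hypothetical), a two-tailed value leaves half of the excluded probability in each tail, while a one-tailed value leaves all of it in one tail.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def critical_values(confidence):
    """Return (two_tailed_z, one_tailed_z) for a confidence level like 0.95."""
    alpha = 1 - confidence
    two_tailed = Z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    one_tailed = Z.inv_cdf(1 - alpha)      # all of alpha in the upper tail
    return two_tailed, one_tailed

for level in (0.90, 0.95, 0.99):
    two, one = critical_values(level)
    print(f"{level:.0%}: two-tailed ±{two:.3f}, one-tailed {one:.3f}")
```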
Why is the Normal Distribution Everywhere?
The Central Limit Theorem (Preview)
The Central Limit Theorem explains why normal distributions are so common:
When you add up (or average) many independent random variables, the result tends toward a normal distribution, regardless of the shape of the original distributions (provided they have finite variance).
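A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not a proof: it averages draws from an exponential distribution (strongly right-skewed, with mean 1 and SD 1) and checks that the averages behave roughly like a normal distribution. The function names, sample sizes, and seed are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

def sample_means(n_per_sample, n_samples, seed=0):
    """Means of n_samples samples, each of n_per_sample exponential draws."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n_per_sample)) / n_per_sample
        for _ in range(n_samples)
    ]

def fraction_within_one_sd(values):
    """Share of values within one standard deviation of their mean."""
    m, s = fmean(values), pstdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

means = sample_means(n_per_sample=50, n_samples=4000)
print(f"mean of sample means: {fmean(means):.3f}")   # theory: 1
print(f"SD of sample means:   {pstdev(means):.3f}")  # theory: 1/sqrt(50)
print(f"within 1 SD:          {fraction_within_one_sd(means):.1%}")
```

If the averages are approximately normal, the last figure should land near the 68% predicted by the empirical rule, even though the individual draws are far from bell-shaped.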
Natural Phenomena
Many biological and physical traits result from the combined effect of many small factors:
- Height: Affected by hundreds of genes plus nutrition, exercise, etc.
- Blood pressure: Influenced by genetics, diet, stress, age, etc.
- Measurement errors: Many small random errors combine to produce an approximately normal distribution
Checking for Normality
Before using normal distribution methods, verify your data is approximately normal.
Visual Methods
- Histogram: Should show bell shape
- Q-Q Plot: Points should fall along diagonal line
- Box Plot: Should be roughly symmetric
Numerical Methods
- Skewness: Should be close to 0
- Kurtosis: Should be close to 3 (or 0 for excess kurtosis)
- Shapiro-Wilk Test: Statistical test for normality
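The first two numerical checks are simple to compute by hand. The following sketch uses only the standard library; the helper names and the simulated data are assumptions made for illustration. For the Shapiro-Wilk test, SciPy provides `scipy.stats.shapiro`, which is not used here so that the example has no third-party dependencies.

```python
import random
from statistics import fmean, pstdev

def skewness(data):
    """Population skewness: mean of cubed standardized values."""
    m, s = fmean(data), pstdev(data)
    return fmean([((x - m) / s) ** 3 for x in data])

def excess_kurtosis(data):
    """Population kurtosis minus 3, so a normal distribution gives about 0."""
    m, s = fmean(data), pstdev(data)
    return fmean([((x - m) / s) ** 4 for x in data]) - 3

rng = random.Random(42)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(100, 15) for _ in range(5000)]
skewed_sample = [rng.expovariate(1 / 20) for _ in range(5000)]

for name, data in (("normal", normal_sample), ("skewed", skewed_sample)):
    print(f"{name}: skewness {skewness(data):+.2f}, "
          f"excess kurtosis {excess_kurtosis(data):+.2f}")
```

Values near 0 for both statistics are consistent with normality, while the clearly positive skewness of the second sample signals a long right tail.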
When Normal Distribution Doesn’t Apply
Not all data is normal. Watch out for:
| Data Type | Typical Distribution |
|---|---|
| Income, wealth | Right-skewed |
| Wait times | Exponential |
| Count data | Poisson |
| Proportions | Binomial |
| Time to failure | Weibull |
Summary
In this lesson, you learned:
- The normal distribution is defined by mean (μ) and standard deviation (σ)
- It’s symmetric and bell-shaped with mean = median = mode
- The empirical rule: 68% within 1 SD, 95% within 2 SD, 99.7% within 3 SD
- The standard normal N(0, 1) is the reference distribution for table-based probability calculations
- Standardize with z = (x - μ)/σ, convert back with x = μ + zσ
- The Central Limit Theorem explains why normal distributions appear everywhere
- Always check normality before applying normal distribution methods
Practice Problems
1. Heights of adult women follow N(64, 2.5). Find: a) P(height < 60 inches) b) P(height > 67 inches) c) P(62 < height < 68 inches)
2. SAT scores follow N(1050, 200). What score is needed to be: a) In the top 5%? b) At the 75th percentile?
3. A machine fills bottles with mean 500 mL and SD 5 mL. Bottles with less than 490 mL are rejected. What percentage is rejected?
4. If exam scores have mean 75 and SD 10, and grades are assigned with A to top 10%, what’s the minimum score for an A?
Answers
1. a) z = (60-64)/2.5 = -1.6; P(Z < -1.6) = 0.0548 (5.48%) b) z = (67-64)/2.5 = 1.2; P(Z > 1.2) = 1 - 0.8849 = 0.1151 (11.51%) c) z₁ = -0.8, z₂ = 1.6; P = 0.9452 - 0.2119 = 0.7333 (73.33%)
2. a) 95th percentile: z = 1.645; X = 1050 + 1.645(200) = 1379 b) 75th percentile: z = 0.674; X = 1050 + 0.674(200) = 1185
3. z = (490-500)/5 = -2.0; P(Z < -2) = 0.0228; 2.28% rejected
4. 90th percentile: z = 1.28; X = 75 + 1.28(10) = 87.8 (round to 88)
Next Steps
With your understanding of the normal distribution, continue with:
- Sampling Distributions - How sample means are distributed
- Confidence Intervals - Estimating population parameters
- Z-Score Calculator - Practice normal probability calculations