intermediate 25 minutes

The Normal Distribution

Understand the most important distribution in statistics. Learn the properties of the bell curve, its parameters, and why it appears everywhere.

Introduction

The normal distribution (also called the Gaussian distribution or bell curve) is the most important probability distribution in statistics. It describes countless natural phenomena and forms the foundation of statistical inference.

What is a Normal Distribution?

A normal distribution is a continuous probability distribution that is:

  • Perfectly symmetric around its center
  • Bell-shaped (highest in middle, tapering at both ends)
  • Defined by exactly two parameters: mean (μ) and standard deviation (σ)
Normal Distribution Notation

$$X \sim N(\mu, \sigma)$$

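Read as: “X follows a normal distribution with mean μ and standard deviation σ.” Some textbooks put the variance σ² in the second slot instead; this lesson always uses the standard deviation.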

Properties of Normal Distributions

1. Symmetry

The curve is perfectly symmetric around the mean. The left half is a mirror image of the right half.

Implications:

  • Mean = Median = Mode
  • 50% of data falls below the mean, 50% above
  • Skewness = 0

2. The Mean Determines Location

Changing μ shifts the entire curve left or right without changing its shape.

Different Means, Same Spread

Three normal distributions with σ = 10:

  • N(50, 10): centered at 50
  • N(80, 10): centered at 80
  • N(100, 10): centered at 100

All three curves have identical shapes—just different locations on the x-axis.

3. The Standard Deviation Determines Spread

Changing σ makes the curve wider (larger σ) or narrower (smaller σ).

Same Mean, Different Spreads

Three normal distributions centered at 100:

  • N(100, 5): narrow, tall peak
  • N(100, 15): medium spread
  • N(100, 30): wide, flat peak

Smaller σ means data is concentrated near the mean. Larger σ means data is more spread out.
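One way to see the effect of σ numerically is to compare the height of each curve at its mean. The sketch below is only an illustration: it uses `statistics.NormalDist` from the Python standard library, and the name `peak_heights` is made up for this example.

```python
from statistics import NormalDist

# Height of the curve at its peak (x = mean) for the three spreads listed above.
peak_heights = {
    sigma: NormalDist(mu=100, sigma=sigma).pdf(100)
    for sigma in (5, 15, 30)
}

for sigma, height in peak_heights.items():
    print(f"sigma = {sigma:>2}: peak height = {height:.4f}")
```

The peak gets lower as σ grows because the total area under the curve stays fixed, so a wider curve has to be flatter.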

4. Asymptotic Tails

The tails of the curve approach but never touch the x-axis. Theoretically, any value from negative infinity to positive infinity is possible.

5. Area Under the Curve = 1

The total area under any normal curve equals 1 (or 100%). This is what makes it a valid probability distribution.

The Probability Density Function

Normal Distribution PDF

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
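As a quick sanity check, the formula can be typed in directly and compared with the standard library's `statistics.NormalDist`. This is a minimal sketch; the helper name `normal_pdf` is hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x) straight from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)


# Density of N(100, 15) at its mean, computed two ways.
by_formula = normal_pdf(100, mu=100, sigma=15)
by_library = NormalDist(mu=100, sigma=15).pdf(100)
print(by_formula, by_library)
```

The value returned is the height of the curve at x, not a probability. Probabilities come from areas under the curve.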

The Empirical Rule (68-95-99.7)

For any normal distribution:

The Empirical Rule
  • About 68% of data falls within 1 standard deviation of the mean
  • About 95% of data falls within 2 standard deviations of the mean
  • About 99.7% of data falls within 3 standard deviations of the mean
Range      Percentage
μ ± 1σ     68.27%
μ ± 2σ     95.45%
μ ± 3σ     99.73%

Applying the Empirical Rule

Adult male heights follow N(70, 3) — mean 70 inches, SD 3 inches.

68% of men are between:

  • 70 - 3 = 67 inches and 70 + 3 = 73 inches

95% of men are between:

  • 70 - 6 = 64 inches and 70 + 6 = 76 inches

99.7% of men are between:

  • 70 - 9 = 61 inches and 70 + 9 = 79 inches

Only about 0.3% of men are shorter than 61” or taller than 79”!
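The three percentages can be reproduced from the cumulative distribution function. The sketch below assumes Python's standard-library `statistics.NormalDist`; `share_within` is a made-up helper name.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)  # adult male heights from the example


def share_within(dist, k):
    """Probability of landing within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(heights, k):.4f}")
```

The same three numbers come out for any choice of mean and standard deviation, which is why the rule holds for every normal distribution.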

The Standard Normal Distribution

The standard normal distribution is a special normal distribution with:

  • Mean μ = 0
  • Standard deviation σ = 1
Standard Normal Distribution

$$Z \sim N(0, 1)$$

Any normal distribution can be converted to the standard normal using z-scores:

$$Z = \frac{X - \mu}{\sigma}$$

This standardization is crucial because:

  1. Z-tables only work for N(0, 1)
  2. It allows comparison across different distributions
  3. It simplifies probability calculations
Standardization in Action

IQ scores: X ~ N(100, 15)

What’s the probability someone has IQ above 130?

Step 1: Convert to a z-score: $z = \frac{130 - 100}{15} = 2.0$

Step 2: Find P(Z > 2.0)

  • From z-table: P(Z < 2.0) = 0.9772
  • P(Z > 2.0) = 1 - 0.9772 = 0.0228 or 2.28%

About 2.3% of the population has an IQ above 130.
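The two steps of the IQ example can be sketched in code, with the standard library's `NormalDist().cdf` standing in for the z-table lookup. This is an illustration rather than the only way to do it; the variable names are invented here.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

z = (130 - iq.mean) / iq.stdev            # Step 1: z-score
p_above_via_z = 1 - NormalDist().cdf(z)   # Step 2: upper tail of N(0, 1)
p_above_direct = 1 - iq.cdf(130)          # same tail without standardizing

print(z, round(p_above_via_z, 4), round(p_above_direct, 4))
```

Both routes give the same probability, which shows that standardizing changes the scale of the problem but not the answer.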

Calculating Normal Probabilities

Three Types of Problems

Type 1: Less Than — P(X < a)

Battery life: X ~ N(500, 25) hours

P(battery lasts less than 450 hours)?

$$z = \frac{450 - 500}{25} = -2.0$$

P(Z < -2.0) = 0.0228 or 2.28%

Type 2: Greater Than — P(X > a)

Using the same battery example:

P(battery lasts more than 550 hours)?

$$z = \frac{550 - 500}{25} = 2.0$$

P(Z > 2.0) = 1 - P(Z < 2.0) = 1 - 0.9772 = 0.0228 or 2.28%

Type 3: Between — P(a < X < b)

P(battery lasts between 480 and 530 hours)?

$$z_1 = \frac{480 - 500}{25} = -0.8$$

$$z_2 = \frac{530 - 500}{25} = 1.2$$

P(-0.8 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.8) = 0.8849 - 0.2119 = 0.6730 or 67.3%
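All three problem types reduce to one or two evaluations of the cumulative distribution function. A minimal sketch for the battery example, assuming the standard library's `statistics.NormalDist`:

```python
from statistics import NormalDist

battery = NormalDist(mu=500, sigma=25)  # battery life in hours

p_less = battery.cdf(450)                        # Type 1: P(X < 450)
p_greater = 1 - battery.cdf(550)                 # Type 2: P(X > 550)
p_between = battery.cdf(530) - battery.cdf(480)  # Type 3: P(480 < X < 530)

print(f"{p_less:.4f} {p_greater:.4f} {p_between:.4f}")
```

Computed values can differ from table-based answers in the fourth decimal place, because z-tables round each entry before the subtraction.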

Finding Values from Probabilities (Inverse)

Sometimes we need to find what value corresponds to a given probability.

Finding a Percentile

Test scores: X ~ N(500, 100)

What score is needed to be in the top 10%?

Step 1: Find z for 90th percentile

  • P(Z < z) = 0.90
  • From z-table: z ≈ 1.28

Step 2: Convert back to the original scale: $X = \mu + z \cdot \sigma = 500 + 1.28(100) = 628$

A score of 628 or higher puts you in the top 10%.

Converting Z to X

$$X = \mu + Z \cdot \sigma$$
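The percentile lookup for the test-score example can be sketched with `inv_cdf`, the inverse of the cumulative distribution function in the standard library's `statistics.NormalDist`. The variable names are illustrative only.

```python
from statistics import NormalDist

scores = NormalDist(mu=500, sigma=100)

z_90 = NormalDist().inv_cdf(0.90)                 # z with 90% of the area below it
cutoff_via_z = scores.mean + z_90 * scores.stdev  # X = mu + z * sigma
cutoff_direct = scores.inv_cdf(0.90)              # same cutoff in one call

print(round(z_90, 4), round(cutoff_via_z, 1))
```

The unrounded z is slightly larger than the table value of 1.28, so the computed cutoff is a fraction of a point above 628.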

Common Z-Values to Know

Confidence Level    Two-Tailed Z    One-Tailed Z
90%                 ±1.645          1.28
95%                 ±1.96           1.645
99%                 ±2.576          2.33

These values appear constantly in:

  • Confidence intervals
  • Hypothesis testing
  • Quality control
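The critical values in the table above do not have to be memorized; they can be recomputed from the inverse CDF. The sketch below assumes the standard library's `statistics.NormalDist`, and the dictionary name `critical` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # N(0, 1)
critical = {}

for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    two_tailed = standard_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    one_tailed = standard_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    critical[level] = (two_tailed, one_tailed)
    print(f"{level:.0%}: two-tailed ±{two_tailed:.3f}, one-tailed {one_tailed:.3f}")
```

Note that the one-tailed value at 95% equals the two-tailed value at 90%, since both leave 5% in the upper tail.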

Why is the Normal Distribution Everywhere?

The Central Limit Theorem (Preview)

The Central Limit Theorem explains why normal distributions are so common:

When you add up (or average) many independent random variables, each with finite variance, the result tends toward a normal distribution, almost regardless of the shape of the original distributions.
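A small simulation can make this concrete. The sketch below averages draws from a flat uniform distribution and checks whether the averages behave like a normal distribution. The sample size, number of repetitions, and seed are arbitrary choices for illustration, not values from this lesson.

```python
import random
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30      # values averaged per sample (arbitrary choice)
NUM_SAMPLES = 10_000  # number of averages to collect (arbitrary choice)

# Each average is built from uniform(0, 1) draws, which are flat, not bell-shaped.
averages = [
    fmean(random.random() for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

center = fmean(averages)
spread = stdev(averages)

# If the averages are roughly normal, about 68% sit within 1 SD of their mean.
share_within_1sd = sum(abs(a - center) <= spread for a in averages) / NUM_SAMPLES
print(f"mean={center:.3f} sd={spread:.3f} within 1 SD={share_within_1sd:.3f}")
```

The share within one standard deviation should land close to the 68% predicted by the empirical rule, even though no single draw came from a normal distribution.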

Natural Phenomena

Many biological and physical traits result from the combined effect of many small factors:

  • Height: Affected by hundreds of genes plus nutrition, exercise, etc.
  • Blood pressure: Influenced by genetics, diet, stress, age, etc.
  • Measurement errors: Many small random errors combine to produce an approximately normal distribution

Checking for Normality

Before using normal distribution methods, verify your data is approximately normal.

Visual Methods

  1. Histogram: Should show bell shape
  2. Q-Q Plot: Points should fall along diagonal line
  3. Box Plot: Should be roughly symmetric

Numerical Methods

  1. Skewness: Should be close to 0
  2. Kurtosis: Should be close to 3 (or 0 for excess kurtosis)
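  3. Shapiro-Wilk Test: A formal hypothesis test whose null hypothesis is that the data are normally distributed; a small p-value is evidence against normality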
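The first two numerical checks can be sketched with the standard library alone. The functions below use the simple population-moment definitions (other software may apply small-sample corrections), and the helper names and simulated data sets are invented for illustration. The Shapiro-Wilk test is left out because it is not in the standard library; SciPy provides it as `scipy.stats.shapiro`.

```python
import random
from statistics import fmean


def skewness(data):
    """Population skewness: third central moment over SD cubed."""
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)
    m3 = fmean((x - mean) ** 3 for x in data)
    return m3 / m2 ** 1.5


def excess_kurtosis(data):
    """Population kurtosis minus 3, so a normal distribution scores about 0."""
    mean = fmean(data)
    m2 = fmean((x - mean) ** 2 for x in data)
    m4 = fmean((x - mean) ** 4 for x in data)
    return m4 / m2 ** 2 - 3


random.seed(1)  # fixed seed so the run is repeatable
normal_like = [random.gauss(100, 15) for _ in range(5_000)]
right_skewed = [random.expovariate(1.0) for _ in range(5_000)]

print(skewness(normal_like), excess_kurtosis(normal_like))    # both near 0
print(skewness(right_skewed), excess_kurtosis(right_skewed))  # clearly positive
```

Values near zero are consistent with normality but do not prove it, so these statistics work best alongside a histogram or Q-Q plot.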

When Normal Distribution Doesn’t Apply

Not all data is normal. Watch out for:

Data Type          Typical Distribution
Income, wealth     Right-skewed
Wait times         Exponential
Count data         Poisson
Proportions        Binomial
Time to failure    Weibull

Summary

In this lesson, you learned:

  • The normal distribution is defined by mean (μ) and standard deviation (σ)
  • It’s symmetric and bell-shaped with mean = median = mode
  • The empirical rule: 68% within 1 SD, 95% within 2 SD, 99.7% within 3 SD
  • Any normal probability can be found by converting to the standard normal N(0, 1)
  • Standardize with z = (x - μ)/σ, convert back with x = μ + zσ
  • The Central Limit Theorem explains why normal distributions appear everywhere
  • Always check normality before applying normal distribution methods

Practice Problems

1. Heights of adult women follow N(64, 2.5). Find:
   a) P(height < 60 inches)
   b) P(height > 67 inches)
   c) P(62 < height < 68 inches)

2. SAT scores follow N(1050, 200). What score is needed to be:
   a) In the top 5%?
   b) At the 75th percentile?

3. A machine fills bottles with normally distributed volumes, mean 500 mL and SD 5 mL. Bottles with less than 490 mL are rejected. What percentage is rejected?

4. If exam scores are normally distributed with mean 75 and SD 10, and an A goes to the top 10% of scores, what’s the minimum score for an A?

Answers

1. a) z = (60-64)/2.5 = -1.6; P(Z < -1.6) = 0.0548 (5.48%)
   b) z = (67-64)/2.5 = 1.2; P(Z > 1.2) = 1 - 0.8849 = 0.1151 (11.51%)
   c) z₁ = -0.8, z₂ = 1.6; P = 0.9452 - 0.2119 = 0.7333 (73.33%)

2. a) 95th percentile: z = 1.645; X = 1050 + 1.645(200) = 1379
   b) 75th percentile: z = 0.674; X = 1050 + 0.674(200) = 1184.8 (round to 1185)

3. z = (490-500)/5 = -2.0; P(Z < -2) = 0.0228; 2.28% rejected

4. 90th percentile: z = 1.28; X = 75 + 1.28(10) = 87.8 (round to 88)
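The answers above can be double-checked in code. The sketch below assumes the standard library's `statistics.NormalDist` and treats the fill volumes in problem 3 and the exam scores in problem 4 as normally distributed; the variable names are made up for this check.

```python
from statistics import NormalDist

women = NormalDist(mu=64, sigma=2.5)   # problem 1
sat = NormalDist(mu=1050, sigma=200)   # problem 2
bottles = NormalDist(mu=500, sigma=5)  # problem 3 (normality assumed)
exam = NormalDist(mu=75, sigma=10)     # problem 4 (normality assumed)

answers = {
    "1a": women.cdf(60),
    "1b": 1 - women.cdf(67),
    "1c": women.cdf(68) - women.cdf(62),
    "2a": sat.inv_cdf(0.95),
    "2b": sat.inv_cdf(0.75),
    "3": bottles.cdf(490),
    "4": exam.inv_cdf(0.90),
}

for label, value in answers.items():
    print(f"{label}: {value:.4f}")
```

Small differences from the table-based answers come from rounding z to two or three decimal places.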
