Bayes' Theorem
Master Bayes' theorem to update probabilities with new evidence. Essential for medical diagnosis, machine learning, and decision making.
The Problem: Reversing Conditional Probability
Often we know P(B|A) but need P(A|B):
- We know P(test positive | have disease), but we need P(have disease | test positive)
- We know P(evidence | guilty), but we need P(guilty | evidence)
- We know P(data | hypothesis), but we need P(hypothesis | data)
Bayes’ theorem lets us reverse conditional probabilities.
Bayes’ Theorem
For events A and B with P(B) > 0:
P(A|B) = P(B|A) · P(A) / P(B)
Understanding Each Term
| Term | Name | Meaning |
|---|---|---|
| P(A|B) | Posterior | Updated probability of A after observing B |
| P(A) | Prior | Initial probability of A before new evidence |
| P(B|A) | Likelihood | Probability of observing B if A is true |
| P(B) | Evidence | Total probability of observing B |
Expanded Form with Total Probability
Often P(B) isn’t directly known. We calculate it using the law of total probability:
P(B) = P(B|A) · P(A) + P(B|not A) · P(not A)
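The formula with the expanded denominator can be sketched in a few lines of code. This is a minimal illustration; the function name `bayes_posterior` is my own, not from the lesson:

```python
def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) = P(B|A)·P(A) / [P(B|A)·P(A) + P(B|not A)·P(not A)]."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # total probability
    return p_b_given_a * p_a / p_b

# Sanity check: if B is equally likely under A and not-A,
# observing B tells us nothing, so the posterior equals the prior.
print(round(bayes_posterior(0.3, 0.5, 0.5), 6))  # 0.3
```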
Classic Example: Medical Testing
A rare disease affects 1% of the population.
Test characteristics:
- Sensitivity: P(+|Disease) = 99% (true positive rate)
- Specificity: P(-|No Disease) = 95% (true negative rate)
Question: If you test positive, what’s the probability you actually have the disease?
Define events:
- D = has disease, P(D) = 0.01
- + = tests positive
What we know:
- P(+|D) = 0.99 (sensitivity)
- P(+|no D) = 1 - 0.95 = 0.05 (false positive rate)
Apply Bayes:
P(D|+) = P(+|D) · P(D) / P(+)
= (0.99)(0.01) / [(0.99)(0.01) + (0.05)(0.99)]
= 0.0099 / 0.0594 ≈ 0.167
Result: Only about 17% of people who test positive actually have the disease!
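The same calculation, written out as a short sketch with the lesson's numbers (variable names are my own):

```python
p_d = 0.01    # P(D): disease prevalence
sens = 0.99   # P(+|D): sensitivity
fpr = 0.05    # P(+|no D): false positive rate = 1 - specificity

p_pos = sens * p_d + fpr * (1 - p_d)   # P(+) via total probability
p_d_given_pos = sens * p_d / p_pos     # Bayes' theorem
print(round(p_d_given_pos, 3))  # 0.167
```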
Intuitive Approach: Natural Frequencies
Sometimes it’s easier to think in terms of actual numbers:
Consider 10,000 people:
With disease (1%): 100 people
- Test positive (99%): 99 people ✓
- Test negative (1%): 1 person
Without disease (99%): 9,900 people
- Test positive (5%): 495 people ✗
- Test negative (95%): 9,405 people
Total positive tests: 99 + 495 = 594
P(disease | positive): 99 / 594 ≈ 0.167 or 16.7%
Same answer, but easier to understand!
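The natural-frequency count above translates directly into code, counting people instead of multiplying probabilities (a sketch, with my own variable names):

```python
n = 10_000
sick = round(n * 0.01)              # 100 people with the disease
healthy = n - sick                  # 9,900 without
true_pos = round(sick * 0.99)       # 99 correct positives
false_pos = round(healthy * 0.05)   # 495 false positives
total_pos = true_pos + false_pos    # 594 positive tests in all

print(round(true_pos / total_pos, 3))  # 0.167
```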
Sequential Testing
What if someone tests positive and takes another test?
After the first positive test, P(D) is now 0.167 (the posterior becomes the new prior).
They take a second independent test and it’s also positive:
P(D|+) = (0.99)(0.167) / [(0.99)(0.167) + (0.05)(0.833)] ≈ 0.165 / 0.207 ≈ 0.80
After two positive tests, there’s about an 80% chance of having the disease.
A third positive test would increase this to about 98%!
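Sequential updating is a natural fit for a loop: each posterior becomes the next prior. A minimal sketch using the lesson's test parameters (the function name `update` is my own):

```python
def update(prior, sens=0.99, fpr=0.05):
    """Posterior P(D|+) given the current prior P(D)."""
    return sens * prior / (sens * prior + fpr * (1 - prior))

p = 0.01  # initial prevalence
for test in range(1, 4):
    p = update(p)
    print(f"after positive test {test}: {p:.3f}")
# after positive test 1: 0.167
# after positive test 2: 0.798
# after positive test 3: 0.987
```

Carrying full precision (rather than the rounded 0.167) gives 0.798 and 0.987, matching the lesson's rounded figures of 80% and about 98%.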
Multiple Hypotheses
Bayes’ theorem extends to multiple competing hypotheses:
Three machines produce widgets:
- Machine A: 50% of production, 2% defect rate
- Machine B: 30% of production, 3% defect rate
- Machine C: 20% of production, 5% defect rate
A defective widget is found. Which machine most likely produced it?
P(defective) = (0.50)(0.02) + (0.30)(0.03) + (0.20)(0.05) = 0.01 + 0.009 + 0.01 = 0.029
P(Machine A | defective) = (0.02)(0.50) / 0.029 = 0.010 / 0.029 ≈ 0.345
P(Machine B | defective) = (0.03)(0.30) / 0.029 = 0.009 / 0.029 ≈ 0.310
P(Machine C | defective) = (0.05)(0.20) / 0.029 = 0.010 / 0.029 ≈ 0.345
Machines A and C are equally likely (34.5% each), despite C having a higher defect rate, because A produces far more widgets overall.
Applications of Bayes’ Theorem
1. Medical Diagnosis
Interpreting test results, especially for rare conditions.
2. Spam Filtering
Naive Bayes classifiers calculate P(spam | words in email).
3. Legal Evidence
DNA matching, forensic evidence interpretation.
4. Machine Learning
Bayesian neural networks, probabilistic programming.
5. Quality Control
Identifying root causes of defects.
6. Weather Forecasting
Updating predictions as new data arrives.
Common Mistakes with Bayes
The Prosecutor’s Fallacy
A famous misuse of conditional probability:
DNA at crime scene matches defendant. P(match | innocent) = 1 in 1,000,000
Prosecutor claims: “There’s only a 1 in a million chance he’s innocent!”
This is WRONG! The prosecutor confused:
- P(match | innocent) = 0.000001
- P(innocent | match) = ???
To find P(guilty | match), we need:
- Prior P(guilty) - What fraction of population is a suspect?
- Total people who could match
If 300 million people could have committed the crime, about 300 would match. If only one is guilty, P(guilty | match) = 1/300 ≈ 0.3%, not 99.9999%!
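The back-of-envelope arithmetic above can be sketched directly. Note the 300 million figure is the hypothetical suspect pool from the text, not real case data:

```python
population = 300_000_000
match_rate = 1e-6                           # P(match | innocent)

expected_matches = population * match_rate  # ≈ 300 people would match by chance
p_guilty = 1 / expected_matches             # only one of the matchers is guilty
print(round(expected_matches), round(p_guilty, 4))  # 300 0.0033
```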
Bayes Factor
The Bayes factor measures how much the evidence supports one hypothesis over another: BF = P(data | H1) / P(data | H2). A rough interpretation scale:
| Bayes Factor | Evidence Strength |
|---|---|
| 1-3 | Barely worth mentioning |
| 3-20 | Positive |
| 20-150 | Strong |
| >150 | Very strong |
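As an illustration (my pairing of hypotheses, using the lesson's test numbers): take H1 = "has disease", H2 = "no disease", and the data a single positive test. The Bayes factor is then the likelihood ratio:

```python
# BF = P(positive | disease) / P(positive | no disease)
bf = 0.99 / 0.05   # sensitivity / false positive rate
print(round(bf, 1))  # 19.8 -> "positive" evidence, just shy of "strong"
```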
Summary
In this lesson, you learned:
- Bayes’ theorem reverses conditional probability: P(A|B) from P(B|A)
- The formula: P(A|B) = P(B|A) · P(A) / P(B)
- Prior probability = initial belief before evidence
- Posterior probability = updated belief after evidence
- Base rate fallacy: Ignoring prior probabilities leads to errors
- Natural frequencies make Bayes easier to understand
- Sequential evidence leads to Bayesian updating
- Watch out for the prosecutor’s fallacy
Practice Problems
1. A disease has 2% prevalence. Test sensitivity is 95%, specificity is 90%. a) P(disease | positive)? b) P(no disease | negative)?
2. Factory A ships 60% of products, Factory B ships 40%. Defect rates are 3% (A) and 5% (B). If a product is defective, what’s P(from Factory A)?
3. 80% of emails are legitimate, 20% are spam. A spam filter has:
- P(flagged | spam) = 0.90
- P(flagged | legitimate) = 0.05
a) P(email is spam | flagged)? b) P(email is legitimate | not flagged)?
4. Using natural frequencies: In 10,000 people, if a disease affects 5% and a test has 90% sensitivity and 85% specificity, how many positive tests are false positives?
Answers
1. a) P(D|+) = (0.95)(0.02) / [(0.95)(0.02) + (0.10)(0.98)] = 0.019 / (0.019 + 0.098) = 0.019 / 0.117 ≈ 0.162 (16.2%)
b) P(no D|-) = (0.90)(0.98) / [(0.90)(0.98) + (0.05)(0.02)] = 0.882 / (0.882 + 0.001) = 0.882 / 0.883 ≈ 0.999 (99.9%)
2. P(defective) = (0.60)(0.03) + (0.40)(0.05) = 0.018 + 0.020 = 0.038
P(A | defective) = (0.03)(0.60) / 0.038 = 0.018 / 0.038 ≈ 0.474 (47.4%)
3. a) P(flagged) = (0.90)(0.20) + (0.05)(0.80) = 0.18 + 0.04 = 0.22
P(spam | flagged) = (0.90)(0.20) / 0.22 = 0.18 / 0.22 ≈ 0.818 (81.8%)
b) P(not flagged) = 1 - 0.22 = 0.78
P(legitimate | not flagged) = (0.95)(0.80) / 0.78 = 0.76 / 0.78 ≈ 0.974 (97.4%)
4.
- Disease (5%): 500 people → 450 test positive (true +), 50 test negative
- No disease (95%): 9,500 people → 8,075 test negative, 1,425 test positive (false +)
- False positives: 1,425
- Of 1,875 total positives, 1,425/1,875 = 76% are false positives!
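The binary-hypothesis answers above can be double-checked numerically. A small sketch (the helper name `post` is my own):

```python
def post(prior, like, like_not):
    """P(H | E) via Bayes, with a single alternative hypothesis not-H."""
    return like * prior / (like * prior + like_not * (1 - prior))

print(round(post(0.02, 0.95, 0.10), 3))  # 1a: 0.162
print(round(post(0.98, 0.90, 0.05), 3))  # 1b: 0.999
print(round(post(0.60, 0.03, 0.05), 3))  # 2:  0.474
print(round(post(0.20, 0.90, 0.05), 3))  # 3a: 0.818
```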
Next Steps
Continue building your probability knowledge:
- Discrete Distributions - Binomial and Poisson
- Continuous Distributions - Normal and beyond
- Probability Calculator - Practice Bayes calculations