Chi-Square Tests
Learn how to perform chi-square tests for independence and goodness of fit. Complete guide with formulas, examples, and step-by-step calculations.
What are Chi-Square Tests?
Chi-square tests are used to analyze categorical data—data that falls into distinct categories rather than continuous measurements. Unlike t-tests or ANOVA (which work with means), chi-square tests work with frequencies and counts.
Common applications include:
- Testing if observed data matches expected proportions
- Determining if two categorical variables are independent
- Analyzing survey responses
- Checking goodness of fit against a probability distribution
Types of Chi-Square Tests
1. Chi-Square Goodness of Fit Test
Tests whether observed frequencies match expected frequencies for a single categorical variable.
Example: Do die rolls show equal frequencies for each number?
2. Chi-Square Test of Independence
Tests whether two categorical variables are independent or related.
Example: Is there a relationship between gender and political preference?
The Chi-Square Statistic
Both tests use the same basic formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where:
- $O$ = Observed frequency (actual count)
- $E$ = Expected frequency (theoretical count)
- $\sum$ = Sum over all categories or cells
How It Works:
- Calculate expected frequencies under the null hypothesis
- Compare observed to expected for each category/cell
- Larger differences → larger chi-square statistic
- Larger chi-square → more evidence against null hypothesis
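The calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `chi_square_statistic` and the three-category counts are assumptions made for the example, not data from this lesson.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories, with 20 expected in each.
expected = [20, 20, 20]
close_fit = chi_square_statistic([18, 22, 20], expected)
poor_fit = chi_square_statistic([5, 35, 20], expected)

# Larger gaps between observed and expected give a larger statistic.
print(close_fit, poor_fit)
```

The same function works for a contingency table if the observed and expected cells are flattened into two lists in the same order.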
Chi-Square Goodness of Fit Test
Purpose
Test whether a single categorical variable follows a specific distribution.
Hypotheses
- $H_0$: The variable follows the specified distribution
- $H_1$: The variable does NOT follow the specified distribution
Steps
1. State hypotheses
2. Calculate expected frequencies: $E_i = n \times p_i$
- Where $n$ = total sample size, $p_i$ = expected proportion for category $i$
3. Compute chi-square statistic
4. Find degrees of freedom: $df = k - 1$, where $k$ = number of categories
5. Compare to critical value or find p-value
Example: A die is rolled 60 times. Are the results consistent with a fair die?
| Number | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed | 8 | 12 | 10 | 9 | 11 | 10 |
Step 1: Hypotheses
- $H_0$: Die is fair (each number has probability 1/6)
- $H_1$: Die is not fair
Step 2: Expected frequencies
- For a fair die: $E = 60 \times \frac{1}{6} = 10$ for each number
Step 3: Calculate chi-square
| Number | O | E | O - E | $(O - E)^2$ | $(O - E)^2 / E$ |
|---|---|---|---|---|---|
| 1 | 8 | 10 | -2 | 4 | 0.4 |
| 2 | 12 | 10 | 2 | 4 | 0.4 |
| 3 | 10 | 10 | 0 | 0 | 0.0 |
| 4 | 9 | 10 | -1 | 1 | 0.1 |
| 5 | 11 | 10 | 1 | 1 | 0.1 |
| 6 | 10 | 10 | 0 | 0 | 0.0 |
| Total | 60 | 60 | 0 | 10 | 1.0 |
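Step 4: Degrees of freedom
- $df = k - 1 = 6 - 1 = 5$
Step 5: Critical value
- At $\alpha = 0.05$ with $df = 5$: critical value = 11.07
- Since $\chi^2 = 1.0 < 11.07$, fail to reject $H_0$
Conclusion: The results are consistent with a fair die ($p > 0.05$).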
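The die example can be reproduced with a short Python sketch. It uses only the standard library, and the variable names and the small critical-value lookup (the usual table values at a significance level of 0.05 for 1 to 5 degrees of freedom) are choices made for this illustration.

```python
# Observed counts from the die example (60 rolls).
observed = {1: 8, 2: 12, 3: 10, 4: 9, 5: 11, 6: 10}

n = sum(observed.values())   # total sample size
k = len(observed)            # number of categories
expected = n / k             # fair die: n * (1/6) per face

chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

# Critical values at alpha = 0.05, copied from a chi-square table.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

reject_h0 = chi_square > CRITICAL_05[df]
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is far below the critical value, so the null hypothesis of a fair die is not rejected.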
Chi-Square Test of Independence
Purpose
Test whether two categorical variables are independent or associated.
Hypotheses
- $H_0$: The two variables are independent (no relationship)
- $H_1$: The two variables are dependent (relationship exists)
Contingency Table
Data is organized in a contingency table (also called a cross-tabulation):
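| | Variable B: Category 1 | Category 2 | Category 3 | Row Total |
|---|---|---|---|---|
| Variable A: Category 1 | $O_{11}$ | $O_{12}$ | $O_{13}$ | $R_1$ |
| Category 2 | $O_{21}$ | $O_{22}$ | $O_{23}$ | $R_2$ |
| Column Total | $C_1$ | $C_2$ | $C_3$ | $n$ |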
Expected Frequencies
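Under independence:
$$E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$$
Or more formally:
$$E_{ij} = \frac{R_i \times C_j}{n}$$
where $R_i$ is the total for row $i$, $C_j$ is the total for column $j$, and $n$ is the grand total.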
Degrees of Freedom
$$df = (r - 1)(c - 1)$$
Where:
- $r$ = number of rows
- $c$ = number of columns
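As a rough sketch, the expected frequencies and degrees of freedom for a contingency table can be computed as follows. The function names `expected_frequencies` and `degrees_of_freedom` and the 2×2 counts are hypothetical, chosen only for illustration.

```python
def expected_frequencies(table):
    """Expected counts under independence: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular contingency table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2x2 table of observed counts (row totals 30 and 70).
table = [[10, 20],
         [30, 40]]

print(expected_frequencies(table))
print(degrees_of_freedom(table))
```

Each expected row keeps the same row total as the observed data, and each expected column keeps the same column total, which is a quick sanity check on the calculation.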
Example: Is there a relationship between smoking status and exercise habits?
Observed Data:
| | Never Exercise | Sometimes | Regularly | Total |
|---|---|---|---|---|
| Smoker | 20 | 30 | 10 | 60 |
| Non-smoker | 15 | 45 | 80 | 140 |
| Total | 35 | 75 | 90 | 200 |
Step 1: Hypotheses
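- $H_0$: Smoking and exercise are independent
- $H_1$: Smoking and exercise are related
Step 2: Calculate expected frequencies
For Smoker & Never: $E = \frac{60 \times 35}{200} = 10.5$
For Smoker & Sometimes: $E = \frac{60 \times 75}{200} = 22.5$
For Smoker & Regularly: $E = \frac{60 \times 90}{200} = 27.0$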
Expected Frequencies Table:
| | Never | Sometimes | Regularly |
|---|---|---|---|
| Smoker | 10.5 | 22.5 | 27.0 |
| Non-smoker | 24.5 | 52.5 | 63.0 |
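Step 3: Calculate chi-square
$$\chi^2 = \frac{(20 - 10.5)^2}{10.5} + \frac{(30 - 22.5)^2}{22.5} + \frac{(10 - 27.0)^2}{27.0} + \frac{(15 - 24.5)^2}{24.5} + \frac{(45 - 52.5)^2}{52.5} + \frac{(80 - 63.0)^2}{63.0}$$
$$\chi^2 \approx 8.60 + 2.50 + 10.70 + 3.68 + 1.07 + 4.59 \approx 31.14$$
Step 4: Degrees of freedom
$$df = (2 - 1)(3 - 1) = 2$$
Step 5: Decision
- Critical value at $\alpha = 0.05$, $df = 2$: 5.99
- Since $\chi^2 \approx 31.14 > 5.99$, reject $H_0$
Conclusion: There is a statistically significant relationship between smoking and exercise habits ($p < 0.001$). Smokers are less likely to exercise regularly.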
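The smoking and exercise example can be checked end to end with a short Python sketch. It uses only the standard library. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is `exp(-x / 2)`; that shortcut does not apply to other degrees of freedom.

```python
import math

# Observed counts from the smoking and exercise example.
# Rows: smoker, non-smoker. Columns: never, sometimes, regularly.
observed = [[20, 30, 10],
            [15, 45, 80]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df = 2 only: P(X >= x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.2e}")
```

The statistic comes out at about 31.14 on 2 degrees of freedom, well above the 5.99 critical value, so independence is rejected.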
Assumptions and Conditions
For valid chi-square tests:
1. Independence
- Observations must be independent
- Each individual contributes to only one cell
- Random sampling is ideal
2. Expected Frequencies
- Rule of thumb: All expected frequencies should be ≥ 5
- If violated, consider:
  - Combining categories
  - Using Fisher’s exact test (for 2×2 tables)
  - Collecting more data
3. Sample Size
- Larger samples are better
- Small samples may produce unreliable results
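The expected-frequency rule of thumb is easy to check before running a test. The helper below is a minimal sketch; the name `small_expected_cells` and the sparse table are hypothetical, while the first table holds the expected counts from the smoking and exercise example.

```python
def small_expected_cells(expected, minimum=5):
    """Return (row, column, value) for every expected count below `minimum`."""
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < minimum
    ]


# Expected counts from the smoking and exercise example: all are at least 5.
expected_ok = [[10.5, 22.5, 27.0],
               [24.5, 52.5, 63.0]]

# Hypothetical expected counts with two cells below 5.
expected_sparse = [[2.4, 9.6],
                   [3.6, 14.4]]

print(small_expected_cells(expected_ok))
print(small_expected_cells(expected_sparse))
```

An empty list means the rule of thumb is satisfied; any listed cells are candidates for combining categories or for switching to Fisher's exact test.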
Interpreting Results
Statistical Significance
- Small p-value (< 0.05): Reject null hypothesis
- Large p-value (≥ 0.05): Fail to reject null hypothesis
Effect Size: Cramér’s V
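For a test of independence, measure effect size with Cramér’s V:
$$V = \sqrt{\frac{\chi^2}{n \times \min(r - 1,\ c - 1)}}$$
where $n$ is the total sample size, $r$ is the number of rows, and $c$ is the number of columns.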
Interpretation:
- 0.00 - 0.10: Negligible association
- 0.10 - 0.30: Weak association
- 0.30 - 0.50: Moderate association
- 0.50+: Strong association
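From the previous example ($\chi^2 \approx 31.14$, $n = 200$, 2 rows, 3 columns):
$$V = \sqrt{\frac{31.14}{200 \times 1}} \approx 0.39$$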
Interpretation: Moderate association between smoking and exercise.
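Cramér's V is a one-line calculation once the chi-square statistic is known. The sketch below assumes the standard definition, V = sqrt(chi-square / (n × min(rows − 1, columns − 1))); the function name `cramers_v` is a label chosen for this example.

```python
import math


def cramers_v(chi_square, n, rows, cols):
    """Cramér's V = sqrt(chi-square / (n * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi_square / (n * min(rows - 1, cols - 1)))


# Smoking and exercise example: 2 rows, 3 columns, n = 200.
# 31.14 is the chi-square statistic recomputed from that example's
# observed counts, rounded to two decimals.
v = cramers_v(31.14, n=200, rows=2, cols=3)
print(round(v, 2))
```

Because V divides by the sample size, it stays comparable across studies of different sizes, unlike the chi-square statistic itself.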
Examining Residuals
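To understand which cells contribute most to chi-square, compute the standardized residual for each cell:
$$\text{Standardized residual} = \frac{O - E}{\sqrt{E}}$$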
- Values > 2 or < -2 indicate cells that deviate substantially from expectation
- Help identify patterns in the data
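As an illustration, the residuals for the smoking and exercise example can be computed as below. This sketch assumes the residual is defined as (O − E) / √E, often called the Pearson residual; some texts use an adjusted version with an extra variance correction, which would give somewhat different values.

```python
import math

# Observed and expected counts from the smoking and exercise example.
observed = [[20, 30, 10],
            [15, 45, 80]]
expected = [[10.5, 22.5, 27.0],
            [24.5, 52.5, 63.0]]

# Residual for each cell: (O - E) / sqrt(E).
residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

row_labels = ["Smoker", "Non-smoker"]
col_labels = ["Never", "Sometimes", "Regularly"]

for label, row in zip(row_labels, residuals):
    for col, r in zip(col_labels, row):
        flag = "  <-- beyond +/-2" if abs(r) > 2 else ""
        print(f"{label:>10} / {col:<9}: {r:+.2f}{flag}")
```

Three cells fall outside ±2: smokers who never exercise (more than expected), smokers who exercise regularly (fewer than expected), and non-smokers who exercise regularly (more than expected).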
Goodness of Fit: Chi-Square Distribution
Under the null hypothesis, the chi-square test statistic approximately follows a chi-square distribution with the appropriate degrees of freedom.
Properties:
- Always non-negative (≥ 0)
- Skewed right
- Approaches normal distribution as df increases
- Mean = df
- Variance = 2×df
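A p-value is the upper-tail area of this distribution beyond the observed statistic. The sketch below computes that area for whole-number degrees of freedom using only Python's `math` module. The function name `chi2_sf` is an assumption for this example, and the method (starting from the closed forms for 1 and 2 degrees of freedom and stepping up with the incomplete-gamma recurrence) is one possible approach; in practice a statistics library would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Starting value: Q(1, z) for even df, Q(1/2, z) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Step up with Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Critical values at alpha = 0.05 from a chi-square table;
# each should give a tail area close to 0.05.
for df, critical in [(1, 3.841), (2, 5.991), (3, 7.815), (4, 9.488), (5, 11.070)]:
    print(df, critical, round(chi2_sf(critical, df), 4))
```

For the die example, `chi2_sf(1.0, 5)` is roughly 0.96, which is why the null hypothesis is not rejected there.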
Common Mistakes
1. Using Percentages Instead of Counts
- Chi-square requires actual counts, not percentages
- If you only have percentages, convert back to counts
2. Ignoring Expected Frequency Requirement
- Don’t use chi-square if expected frequencies are too small
3. Confusing Independence with Causation
- Rejecting independence means variables are associated
- Doesn’t prove one causes the other
4. Using Wrong Degrees of Freedom
- Goodness of fit: $df = k - 1$
- Independence: $df = (r - 1)(c - 1)$
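The first mistake in this list can be demonstrated numerically. The sketch below reuses the counts from the die example and shows that feeding percentages into the formula changes the statistic; the variable names are illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts from the die example: 60 rolls, 10 expected per face.
counts = [8, 12, 10, 9, 11, 10]
n = sum(counts)
expected_counts = [n / 6] * 6

# The same data expressed as percentages of the 60 rolls.
percentages = [100 * c / n for c in counts]
expected_percentages = [100 / 6] * 6

correct = chi_square_statistic(counts, expected_counts)
wrong = chi_square_statistic(percentages, expected_percentages)
print(f"from counts: {correct:.2f}, from percentages: {wrong:.2f}")

# Recover counts from percentages when the sample size is known.
recovered = [round(p * n / 100) for p in percentages]
print(recovered)
```

Percentages behave as if the sample size were 100, so here the statistic is inflated by a factor of 100/60. With a sample larger than 100 the error runs the other way and the statistic is understated.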
Practical Applications
Market Research
- Customer preferences across demographics
- Brand loyalty studies
Medicine
- Disease rates across groups
- Treatment outcomes by category
Social Sciences
- Survey response patterns
- Voting behavior analysis
Quality Control
- Defect rates across production lines
- Product category distributions
Chi-Square vs. Other Tests
| Comparison | Chi-Square | Alternative |
|---|---|---|
| Categorical vs. Continuous | Categorical data | t-test or ANOVA for continuous |
| Independence | Non-parametric | May need correlation for continuous |
| Small samples | Unreliable when expected frequencies < 5 | Fisher’s exact test |
| Ordinal data | Can use, but… | Consider Mann-Whitney or Kruskal-Wallis |
Summary
In this lesson, you learned:
- Chi-square tests analyze categorical data using frequencies
- Goodness of fit tests if data matches a theoretical distribution
- Test of independence checks relationships between two categorical variables
- Formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$
- Expected frequencies should be ≥ 5 in all cells
- Degrees of freedom: $k - 1$ for goodness of fit, $(r - 1)(c - 1)$ for independence
- Cramér’s V measures effect size for test of independence
Frequently Asked Questions
What is a chi-square test used for?
The chi-square test is used to determine if there’s a statistically significant relationship between two categorical variables (test of independence) or if observed data matches expected frequencies (goodness of fit test). It’s commonly used in survey analysis, medical research, and quality control.
What is the chi-square formula?
The chi-square formula is: χ² = Σ[(O - E)² / E], where O is the observed frequency and E is the expected frequency. You calculate this for each cell and sum the results.
How do I calculate degrees of freedom for chi-square?
For goodness-of-fit test: df = k - 1, where k is the number of categories. For test of independence: df = (r - 1) × (c - 1), where r is rows and c is columns. For a 2×2 table, df = 1.
What does a significant chi-square result mean?
A significant chi-square result (p < 0.05) means the observed frequencies differ significantly from expected frequencies. For a test of independence, it means the two variables are associated (not independent).
When should I use chi-square vs t-test?
Use chi-square for categorical data (counts/frequencies). Use t-test for continuous numerical data (means). For example, use chi-square to test if there’s a relationship between gender and voting preference; use t-test to compare average test scores between groups.
What if expected frequency is less than 5?
If more than 20% of cells have expected frequency < 5, the chi-square test may be invalid. Options: (1) Combine categories, (2) Use Fisher’s exact test for 2×2 tables, (3) Collect more data.
What is the chi-square critical value at α = 0.05?
Common critical values at α = 0.05: df=1 → 3.841, df=2 → 5.991, df=3 → 7.815, df=4 → 9.488, df=5 → 11.070. Use the chi-square table for all values.
Can chi-square tell you which groups differ?
No, a significant chi-square only tells you that an association exists. To identify which cells differ from expected, examine the standardized residuals: values above 2 or below -2 indicate notable deviations.
Next Steps
Continue learning about categorical data analysis:
- Chi-Square Table - Look up critical values
- Odds Ratios - Another way to analyze categorical data
- Logistic Regression - Model categorical outcomes