intermediate 25 minutes

Chi-Square Tests

Learn how to perform chi-square tests for independence and goodness of fit. Complete guide with formulas, examples, and step-by-step calculations.


What are Chi-Square Tests?

Chi-square tests are used to analyze categorical data—data that falls into distinct categories rather than continuous measurements. Unlike t-tests or ANOVA (which work with means), chi-square tests work with frequencies and counts.

Common applications include:

  • Testing if observed data matches expected proportions
  • Determining if two categorical variables are independent
  • Analyzing survey responses
  • Goodness-of-fit for probability distributions

Types of Chi-Square Tests

1. Chi-Square Goodness of Fit Test

Tests whether observed frequencies match expected frequencies for a single categorical variable.

Example: Do die rolls show equal frequencies for each number?

2. Chi-Square Test of Independence

Tests whether two categorical variables are independent or related.

Example: Is there a relationship between gender and political preference?

The Chi-Square Statistic

Both tests use the same basic formula:

Chi-Square Statistic

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:

  • $O$ = Observed frequency (actual count)
  • $E$ = Expected frequency (theoretical count)
  • $\sum$ = Sum over all categories or cells

How It Works:

  1. Calculate expected frequencies under the null hypothesis
  2. Compare observed to expected for each category/cell
  3. Larger differences → larger chi-square statistic
  4. Larger chi-square → more evidence against null hypothesis
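
The four steps above reduce to a short computation. The following is a minimal sketch in plain Python; the function name `chi_square_statistic` and the sample counts are illustrative assumptions, not part of the lesson's data.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over paired observed/expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
close_fit = chi_square_statistic([48, 52, 50], [50, 50, 50])
poor_fit = chi_square_statistic([30, 70, 50], [50, 50, 50])

print(round(close_fit, 2))  # 0.16: observed is near expected
print(round(poor_fit, 2))   # 16.0: observed is far from expected
```

The second call yields the larger statistic because its observed counts sit further from the expected counts, which is the pattern steps 3 and 4 describe.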

Chi-Square Goodness of Fit Test

Purpose

Test whether a single categorical variable follows a specific distribution.

Hypotheses

  • $H_0$: The variable follows the specified distribution
  • $H_A$: The variable does NOT follow the specified distribution

Steps

1. State hypotheses

2. Calculate expected frequencies

  • $E_i = n \times p_i$
  • Where $n$ = total sample size, $p_i$ = expected proportion for category $i$

3. Compute chi-square statistic: $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$

4. Find degrees of freedom: $df = k - 1$, where $k$ = number of categories

5. Compare to critical value or find p-value

Goodness of Fit: Fair Die Test

A die is rolled 60 times. Are the results consistent with a fair die?

| Number | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed | 8 | 12 | 10 | 9 | 11 | 10 |

Step 1: Hypotheses

  • $H_0$: Die is fair (each number has probability 1/6)
  • $H_A$: Die is not fair

Step 2: Expected frequencies

  • For a fair die: $E = 60 \times \frac{1}{6} = 10$ for each number

Step 3: Calculate chi-square

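| Number | $O$ | $E$ | $O - E$ | $(O - E)^2$ | $(O - E)^2 / E$ |
|---|---|---|---|---|---|
| 1 | 8 | 10 | -2 | 4 | 0.4 |
| 2 | 12 | 10 | 2 | 4 | 0.4 |
| 3 | 10 | 10 | 0 | 0 | 0.0 |
| 4 | 9 | 10 | -1 | 1 | 0.1 |
| 5 | 11 | 10 | 1 | 1 | 0.1 |
| 6 | 10 | 10 | 0 | 0 | 0.0 |
| Total | | | | | 1.0 |

$$\chi^2 = 1.0$$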

Step 4: Degrees of freedom

$df = 6 - 1 = 5$

Step 5: Critical value

  • At $\alpha = 0.05$ with $df = 5$: critical value = 11.07
  • Since $1.0 < 11.07$, fail to reject $H_0$

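Conclusion: The data are consistent with a fair die ($p > 0.05$). Failing to reject $H_0$ does not prove the die is fair; it means these 60 rolls give no significant evidence of bias.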
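
The fair-die calculation can be reproduced in a few lines. This is a sketch using only the numbers given in the example; the variable names are illustrative, and the critical value is the one quoted above rather than one computed here.

```python
observed = [8, 12, 10, 9, 11, 10]          # counts for faces 1..6 from the example
n = sum(observed)                          # 60 rolls
expected = [n / len(observed)] * len(observed)  # fair die: 10 per face

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_05_DF5 = 11.07  # critical value quoted in the lesson (alpha = 0.05, df = 5)
reject_h0 = chi_sq > CRITICAL_05_DF5

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```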

Chi-Square Test of Independence

Purpose

Test whether two categorical variables are independent or associated.

Hypotheses

  • $H_0$: The two variables are independent (no relationship)
  • $H_A$: The two variables are dependent (relationship exists)

Contingency Table

Data is organized in a contingency table (also called a cross-tabulation):

| | Variable B: Category 1 | Category 2 | Category 3 | Row Total |
|---|---|---|---|---|
| Variable A: Category 1 | $O_{11}$ | $O_{12}$ | $O_{13}$ | $R_1$ |
| Category 2 | $O_{21}$ | $O_{22}$ | $O_{23}$ | $R_2$ |
| Column Total | $C_1$ | $C_2$ | $C_3$ | $n$ |

Expected Frequencies

Under independence:

Expected Frequency

$$E_{ij} = \frac{(\text{Row Total})_i \times (\text{Column Total})_j}{\text{Grand Total}}$$

Or more formally:

$$E_{ij} = \frac{R_i \times C_j}{n}$$

Degrees of Freedom

df for Test of Independence

$$df = (r - 1)(c - 1)$$

Where:

  • $r$ = number of rows
  • $c$ = number of columns

Test of Independence: Smoking and Exercise

Is there a relationship between smoking status and exercise habits?

Observed Data:

| | Never Exercise | Sometimes | Regularly | Total |
|---|---|---|---|---|
| Smoker | 20 | 30 | 10 | 60 |
| Non-smoker | 15 | 45 | 80 | 140 |
| Total | 35 | 75 | 90 | 200 |

Step 1: Hypotheses

  • $H_0$: Smoking and exercise are independent
  • $H_A$: Smoking and exercise are related

Step 2: Calculate expected frequencies

For Smoker & Never: $E = \frac{60 \times 35}{200} = 10.5$

For Smoker & Sometimes: $E = \frac{60 \times 75}{200} = 22.5$

For Smoker & Regularly: $E = \frac{60 \times 90}{200} = 27$

Expected Frequencies Table:

| | Never | Sometimes | Regularly |
|---|---|---|---|
| Smoker | 10.5 | 22.5 | 27.0 |
| Non-smoker | 24.5 | 52.5 | 63.0 |
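
The whole expected-frequency table follows from the row and column totals. Below is a small sketch in plain Python; `expected_frequencies` is a hypothetical helper name, and the table is entered as a list of rows.

```python
def expected_frequencies(table):
    """Expected counts under independence: E_ij = R_i * C_j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the example (rows: smoker, non-smoker;
# columns: never, sometimes, regularly).
observed = [
    [20, 30, 10],
    [15, 45, 80],
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
# [10.5, 22.5, 27.0]
# [24.5, 52.5, 63.0]
```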

Step 3: Calculate chi-square

$$\chi^2 = \frac{(20-10.5)^2}{10.5} + \frac{(30-22.5)^2}{22.5} + \frac{(10-27)^2}{27} + \frac{(15-24.5)^2}{24.5} + \frac{(45-52.5)^2}{52.5} + \frac{(80-63)^2}{63}$$

$$= 8.60 + 2.50 + 10.70 + 3.68 + 1.07 + 4.59 = 31.14$$

Step 4: Degrees of freedom

$df = (2-1)(3-1) = 2$

Step 5: Decision

  • Critical value at $\alpha = 0.05$, $df = 2$: 5.99
  • Since $31.14 > 5.99$, reject $H_0$

Conclusion: There is a statistically significant relationship between smoking and exercise habits ($p < 0.001$). Smokers are less likely to exercise regularly.
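
The full test can be checked end to end in code. The sketch below uses only the Python standard library; `chi_square_independence` is a hypothetical helper name. The p-value line relies on a fact that holds only for $df = 2$, where the upper-tail probability of the chi-square distribution is $e^{-x/2}$; other degrees of freedom need a different formula or a statistics library.

```python
import math


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under H0
            stat += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


observed = [[20, 30, 10], [15, 45, 80]]
chi_sq, df = chi_square_independence(observed)

# Valid for df = 2 only: P(X >= x) = exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.1e}")
```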

Assumptions and Conditions

For valid chi-square tests:

1. Independence

  • Observations must be independent
  • Each individual contributes to only one cell
  • Random sampling is ideal

2. Expected Frequencies

  • Rule of thumb: All expected frequencies should be ≥ 5
  • If violated, consider:
    • Combining categories
    • Using Fisher’s exact test (for 2×2 tables)
    • Collecting more data
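
The expected-frequency rule of thumb is easy to check before running a test. This is a sketch; `small_expected_cells` is a hypothetical helper, the threshold of 5 is the rule of thumb stated above, and the second table is invented for illustration.

```python
def small_expected_cells(expected, threshold=5):
    """Return (row, column, value) for every expected count below the threshold."""
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < threshold
    ]


# Expected counts from the smoking example: every cell passes.
ok_table = [[10.5, 22.5, 27.0], [24.5, 52.5, 63.0]]
# Hypothetical table with one sparse cell (illustration only).
sparse_table = [[3.2, 16.8], [12.8, 67.2]]

print(small_expected_cells(ok_table))      # []
print(small_expected_cells(sparse_table))  # [(0, 0, 3.2)]
```

An empty result means the rule of thumb is satisfied; any listed cell is a candidate for combining categories or switching to an exact test.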

3. Sample Size

  • Larger samples are better
  • Small samples may produce unreliable results

Interpreting Results

Statistical Significance

  • Small p-value (< 0.05): Reject null hypothesis
  • Large p-value (≥ 0.05): Fail to reject null hypothesis
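
A p-value can be computed directly rather than read from a table. The sketch below is one way to do it with the standard library, using the finite-sum form of the chi-square upper tail that holds for whole-number degrees of freedom. The name `chi2_sf` is an assumption of this example; in practice a library routine such as SciPy's `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2
    if df % 2 == 0:
        # Even df: exp(-y) * sum over k < df/2 of y^k / k!
        terms = (y ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-y) * sum(terms)
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum over k < (df-1)/2 of y^(k+1/2) / Gamma(k+3/2)
    terms = (y ** (k + 0.5) / math.gamma(k + 1.5) for k in range((df - 1) // 2))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * sum(terms)


alpha = 0.05
p_die = chi2_sf(1.0, 5)        # fair-die example
p_smoking = chi2_sf(31.14, 2)  # smoking and exercise example

for label, p in [("die", p_die), ("smoking", p_smoking)]:
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: p = {p:.4g} -> {decision}")
```

As a sanity check, evaluating the function at a critical value for $\alpha = 0.05$ (for example 3.841 with $df = 1$) should return approximately 0.05.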

Effect Size: Cramér’s V

For a test of independence, measure the effect size with Cramér’s V:

Cramér's V

$$V = \sqrt{\frac{\chi^2}{n \times \min(r-1, c-1)}}$$

Interpretation:

  • 0.00 - 0.10: Negligible association
  • 0.10 - 0.30: Weak association
  • 0.30 - 0.50: Moderate association
  • 0.50+: Strong association

Calculating Cramér's V

From the previous example:

  • $\chi^2 = 31.14$
  • $n = 200$
  • $\min(r-1, c-1) = \min(1, 2) = 1$

$$V = \sqrt{\frac{31.14}{200 \times 1}} = \sqrt{0.1557} = 0.395$$

Interpretation: Moderate association between smoking and exercise.
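
The same calculation as a short Python sketch. Both function names are illustrative. The interpretation bands in the lesson share their endpoints, so this example assumes each boundary value belongs to the higher band.

```python
import math


def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi_sq / (n * min(r - 1, c - 1)))."""
    return math.sqrt(chi_sq / (n * min(n_rows - 1, n_cols - 1)))


def describe(v):
    """Map V onto the lesson's labels (boundary handling is an assumption)."""
    if v < 0.10:
        return "negligible"
    if v < 0.30:
        return "weak"
    if v < 0.50:
        return "moderate"
    return "strong"


v = cramers_v(31.14, 200, 2, 3)
print(f"V = {v:.3f} ({describe(v)})")  # V = 0.395 (moderate)
```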

Examining Residuals

To understand which cells contribute most to chi-square:

Standardized Residual

$$r_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}$$

  • Values > 2 or < -2 indicate cells that deviate substantially from expectation
  • Help identify patterns in the data
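
Applied to the smoking and exercise example, the residual formula shows which cells drive the result. This is a sketch using the observed and expected counts from that example; the variable names are illustrative.

```python
import math

observed = [[20, 30, 10], [15, 45, 80]]
expected = [[10.5, 22.5, 27.0], [24.5, 52.5, 63.0]]
rows = ["Smoker", "Non-smoker"]
cols = ["Never", "Sometimes", "Regularly"]

# r_ij = (O_ij - E_ij) / sqrt(E_ij) for every cell
residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Flag cells whose residual is above 2 or below -2.
flagged = [
    (rows[i], cols[j], round(r, 2))
    for i, res_row in enumerate(residuals)
    for j, r in enumerate(res_row)
    if abs(r) > 2
]
for cell in flagged:
    print(cell)
# ('Smoker', 'Never', 2.93)
# ('Smoker', 'Regularly', -3.27)
# ('Non-smoker', 'Regularly', 2.14)
```

Three cells are flagged: smokers are over-represented among those who never exercise and under-represented among regular exercisers, while non-smokers are over-represented among regular exercisers.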

Goodness of Fit: Chi-Square Distribution

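Under the null hypothesis, the chi-square test statistic approximately follows a chi-square distribution with the appropriate degrees of freedom. The approximation improves as the expected frequencies grow.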

Properties:

  • Always non-negative (≥ 0)
  • Skewed right
  • Approaches normal distribution as df increases
  • Mean = df
  • Variance = 2×df
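
These properties can be checked by simulation. The sketch below relies on a standard fact not stated in the lesson: a chi-square variable with $df$ degrees of freedom can be generated as the sum of $df$ squared independent standard normal draws. The seed, sample size, and choice of $df = 5$ are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

df = 5
n_draws = 20_000

# Sum of df squared standard normal draws gives one chi-square(df) draw.
draws = [
    sum(random.gauss(0, 1) ** 2 for _ in range(df))
    for _ in range(n_draws)
]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

print(f"mean ~ {sample_mean:.2f} (theory: {df})")
print(f"variance ~ {sample_var:.2f} (theory: {2 * df})")
print(f"minimum draw = {min(draws):.3f} (never negative)")
```

The simulated mean and variance should land close to $df$ and $2 \times df$, with small differences due to sampling noise.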

Common Mistakes

1. Using Percentages Instead of Counts

  • Chi-square requires actual counts, not percentages
  • If you only have percentages, convert back to counts
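
The reason counts matter is that the statistic depends on sample size, which percentages hide. In this sketch the percentages and sample sizes are hypothetical; the same observed split of 40% / 35% / 25% is tested against equal thirds at two sample sizes.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over paired observed/expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical survey: 40% / 35% / 25% observed, equal thirds expected.
percentages = [40, 35, 25]

results = {}
for n in (60, 600):
    # Convert percentages back to counts before testing.
    observed = [round(p * n / 100) for p in percentages]
    expected = [n / 3] * 3
    results[n] = chi_square_statistic(observed, expected)
    print(n, observed, round(results[n], 2))
# 60 [24, 21, 15] 2.1
# 600 [240, 210, 150] 21.0
```

With $df = 2$ the critical value at $\alpha = 0.05$ is 5.99, so the identical percentages are not significant with 60 respondents but are significant with 600.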

2. Ignoring Expected Frequency Requirement

  • Don’t use chi-square if expected frequencies are too small

3. Confusing Independence with Causation

  • Rejecting independence means variables are associated
  • Doesn’t prove one causes the other

4. Using Wrong Degrees of Freedom

  • Goodness of fit: $df = k - 1$
  • Independence: $df = (r-1)(c-1)$

Practical Applications

Market Research

  • Customer preferences across demographics
  • Brand loyalty studies

Medicine

  • Disease rates across groups
  • Treatment outcomes by category

Social Sciences

  • Survey response patterns
  • Voting behavior analysis

Quality Control

  • Defect rates across production lines
  • Product category distributions

Chi-Square vs. Other Tests

| Comparison | Chi-Square | Alternative |
|---|---|---|
| Categorical vs. Continuous | Categorical data | t-test or ANOVA for continuous |
| Independence | Non-parametric | May need correlation for continuous |
| Small samples | Fisher’s exact test | When expected frequencies < 5 |
| Ordinal data | Can use, but… | Consider Mann-Whitney or Kruskal-Wallis |

Summary

In this lesson, you learned:

  • Chi-square tests analyze categorical data using frequencies
  • Goodness of fit tests if data matches a theoretical distribution
  • Test of independence checks relationships between two categorical variables
  • Formula: $\chi^2 = \sum (O - E)^2 / E$
  • Expected frequencies should be ≥ 5 in all cells
  • Degrees of freedom: $k-1$ for goodness of fit, $(r-1)(c-1)$ for independence
  • Cramér’s V measures effect size for test of independence

Frequently Asked Questions

What is a chi-square test used for?

The chi-square test is used to determine if there’s a statistically significant relationship between two categorical variables (test of independence) or if observed data matches expected frequencies (goodness of fit test). It’s commonly used in survey analysis, medical research, and quality control.

What is the chi-square formula?

The chi-square formula is: χ² = Σ[(O - E)² / E], where O is the observed frequency and E is the expected frequency. You calculate this for each cell and sum the results.

How do I calculate degrees of freedom for chi-square?

For goodness-of-fit test: df = k - 1, where k is the number of categories. For test of independence: df = (r - 1) × (c - 1), where r is rows and c is columns. For a 2×2 table, df = 1.

What does a significant chi-square result mean?

A significant chi-square result (p < 0.05) means the observed frequencies differ significantly from expected frequencies. For a test of independence, it means the two variables are associated (not independent).

When should I use chi-square vs t-test?

Use chi-square for categorical data (counts/frequencies). Use t-test for continuous numerical data (means). For example, use chi-square to test if there’s a relationship between gender and voting preference; use t-test to compare average test scores between groups.

What if expected frequency is less than 5?

If more than 20% of cells have expected frequency < 5, the chi-square test may be invalid. Options: (1) Combine categories, (2) Use Fisher’s exact test for 2×2 tables, (3) Collect more data.

What is the chi-square critical value at α = 0.05?

Common critical values at α = 0.05: df=1 → 3.841, df=2 → 5.991, df=3 → 7.815, df=4 → 9.488, df=5 → 11.070. Use the chi-square table for all values.

Can chi-square tell you which groups differ?

No, a significant chi-square only tells you that an association exists. To identify which cells differ from expected, examine the standardized residuals: values above 2 or below -2 indicate notable deviations.


