Advanced · 30 minutes

Multiple Regression

Extend regression to multiple predictors. Learn to interpret coefficients, assess multicollinearity, and build predictive models.


Introduction to Multiple Regression

Multiple regression extends simple regression to include multiple predictor variables.

Multiple Regression Model

ŷ = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ

Where:

  • ŷ = predicted value
  • b₀ = intercept
  • b₁, b₂, …, bₖ = regression coefficients
  • x₁, x₂, …, xₖ = predictor variables
Multiple Regression Example

Predicting house price:

Price = 50,000 + 100(Square Feet) + 15,000(Bedrooms) + 8,000(Bathrooms)

For a house with 2000 sq ft, 3 bedrooms, 2 bathrooms:

Price = 50,000 + 100(2000) + 15,000(3) + 8,000(2) = 50,000 + 200,000 + 45,000 + 16,000 = $311,000
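
As a quick sanity check, the arithmetic above can be reproduced in a few lines of Python. The coefficients are the illustrative ones from this example, not the output of a fitted model:

```python
# Illustrative coefficients from the house-price example above.
intercept = 50_000
b_sqft, b_bed, b_bath = 100, 15_000, 8_000

def predict_price(sqft, bedrooms, bathrooms):
    """Plug predictor values into the fitted equation."""
    return intercept + b_sqft * sqft + b_bed * bedrooms + b_bath * bathrooms

print(predict_price(2000, 3, 2))  # 311000
```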


Interpreting Coefficients

In multiple regression, each coefficient is a partial effect—the effect of that variable holding all other variables constant.

Partial Effect Interpretation

Model: Salary = 30,000 + 2,000(Experience) + 5,000(Education)

Coefficient for Experience (2,000): “Holding education constant, each additional year of experience is associated with $2,000 higher salary.”

Coefficient for Education (5,000): “Holding experience constant, each additional year of education is associated with $5,000 higher salary.”
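
In practice, these partial effects come from fitting the model to data. A minimal sketch using statsmodels on simulated data (the sample size, noise level, and true coefficients are assumptions chosen to mimic the salary example):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
experience = rng.uniform(0, 20, n)
education = rng.uniform(10, 20, n)
# Simulated salaries roughly following the example model, plus noise.
salary = 30_000 + 2_000 * experience + 5_000 * education + rng.normal(0, 5_000, n)

X = sm.add_constant(np.column_stack([experience, education]))
model = sm.OLS(salary, X).fit()
print(model.params)  # intercept, experience slope, education slope
```

Each fitted slope estimates the change in salary per unit change in that predictor with the other predictor held fixed.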


Adjusted R-Squared

As you add more predictors, R² always increases (or stays the same). Adjusted R² penalizes for adding variables.

Adjusted R-Squared

R²_adj = 1 - [(1-R²)(n-1)] / (n-k-1)

Where:

  • n = sample size
  • k = number of predictors

Comparison                     Interpretation
R²_adj increases               Variable improves the model
R²_adj decreases               Variable doesn't help enough
R²_adj much smaller than R²    Possible overfitting
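
The formula translates directly into code. A small helper, assuming you already know R², n, and k:

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(adjusted_r_squared(0.70, 50, 4), 3))  # 0.673 (see practice problem 1 below)
```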

F-Test for Overall Model

F-Test

H₀: All slopes = 0 (the model has no predictive value)
H₁: At least one slope ≠ 0

F = (R²/k) / [(1-R²)/(n-k-1)]

With df₁ = k and df₂ = n - k - 1

Overall F-Test

Model: R² = 0.60, n = 100, k = 3 predictors

F = (0.60/3) / (0.40/96) = 0.20 / 0.00417 = 48.0

With df₁ = 3, df₂ = 96, critical F ≈ 2.70

Since 48.0 is far greater than 2.70, the overall model is statistically significant.
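
The same calculation, plus a p-value from the F distribution, can be done with scipy (a sketch using the numbers from this example):

```python
from scipy import stats

r2, n, k = 0.60, 100, 3
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
p_value = stats.f.sf(f_stat, dfn=k, dfd=n - k - 1)  # upper-tail probability
print(round(f_stat, 1), p_value)  # 48.0 and a p-value far below 0.05
```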


Individual Coefficient Tests

T-Test for Each Coefficient

H₀: βᵢ = 0
H₁: βᵢ ≠ 0

t = bᵢ / SE(bᵢ)

With df = n - k - 1

A significant t-test means that variable contributes to the model beyond what the other predictors already explain.
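
A hand calculation is straightforward once you have a coefficient and its standard error (the values below are hypothetical):

```python
from scipy import stats

b, se = 2_000, 450      # hypothetical coefficient and its standard error
n, k = 100, 3
t_stat = b / se
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)  # two-sided p-value
print(round(t_stat, 2), p_value)
```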


Multicollinearity

Multicollinearity occurs when predictors are highly correlated with each other.

Problems Caused

  • Unstable coefficient estimates
  • Large standard errors
  • Coefficients may have “wrong” signs
  • Hard to determine individual variable importance

Detection: Variance Inflation Factor (VIF)

VIF

VIFⱼ = 1 / (1 - Rⱼ²)

Where Rⱼ² is R² from regressing xⱼ on all other predictors

VIF Value    Interpretation
1            No correlation with other predictors
1–5          Moderate correlation
5–10         High correlation (concerning)
>10          Severe multicollinearity
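
statsmodels provides a VIF function; a sketch on simulated data where x2 is deliberately constructed to be highly correlated with x1:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=200)   # strongly correlated with x1
x3 = rng.normal(size=200)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
# Skip column 0 (the constant); report a VIF for each predictor.
for j, name in enumerate(["x1", "x2", "x3"], start=1):
    print(name, variance_inflation_factor(X, j))
```

Expect large VIFs for x1 and x2 and a value near 1 for x3.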

Solutions

  • Remove one of the correlated variables
  • Combine variables (create index)
  • Use principal component analysis
  • Use ridge regression or LASSO

Model Building Strategies

1. Backward Elimination

Start with all variables, remove least significant one at a time.

2. Forward Selection

Start with no variables, add most significant one at a time.

3. Stepwise Selection

Combination: add and remove variables based on significance.

4. Theory-Based

Include variables based on domain knowledge, regardless of significance.
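
As an illustration of strategy 1, here is a simplified backward-elimination sketch built on statsmodels p-values. The stopping rule, the α = 0.05 threshold, and the simulated data are assumptions for illustration; real model building should also weigh theory and adjusted R²:

```python
import numpy as np
import statsmodels.api as sm

def backward_eliminate(y, X, names, alpha=0.05):
    """Repeatedly drop the least significant predictor until all p-values < alpha."""
    X, names = X.copy(), list(names)
    while names:
        model = sm.OLS(y, sm.add_constant(X)).fit()
        pvals = model.pvalues[1:]            # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] < alpha:
            return model, names              # everything left is significant
        X = np.delete(X, worst, axis=1)      # drop the weakest predictor
        names.pop(worst)
    return sm.OLS(y, np.ones_like(y)).fit(), names   # intercept-only fallback

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 4))
y = 3 + 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=150)
model, kept = backward_eliminate(y, X, ["x1", "x2", "x3", "x4"])
print(kept)   # most likely ["x1", "x3"]
```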


Categorical Predictors (Dummy Variables)

Categorical variables need to be converted to dummy variables.

Dummy Variables

Region with 3 categories: North, South, West

Create k-1 = 2 dummy variables:

  • D_South = 1 if South, 0 otherwise
  • D_West = 1 if West, 0 otherwise
  • North is the reference category (when both dummies = 0)

Model: Y = 100 + 5(D_South) + (-3)(D_West)

  • North: Y = 100 + 5(0) + (-3)(0) = 100
  • South: Y = 100 + 5(1) + (-3)(0) = 105
  • West: Y = 100 + 5(0) + (-3)(1) = 97

Interpretation: South is 5 units higher than North; West is 3 units lower than North.
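
With pandas, dummy coding is a single call; drop_first=True keeps k − 1 dummies so the dropped category (here North, the alphabetically first) becomes the reference:

```python
import pandas as pd

df = pd.DataFrame({"region": ["North", "South", "West", "South"]})
dummies = pd.get_dummies(df["region"], prefix="D", drop_first=True)
print(dummies)   # columns D_South and D_West; North rows are all zeros
```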


Interaction Terms

Interaction occurs when the effect of one variable depends on another.

Interaction Model

ŷ = b₀ + b₁x₁ + b₂x₂ + b₃(x₁ × x₂)

Interaction Effect

Income = 20,000 + 3,000(Education) + 2,000(Experience) + 500(Education × Experience)

The effect of education depends on experience:

  • At Experience = 0: Effect of education = 3,000
  • At Experience = 10: Effect of education = 3,000 + 500(10) = 8,000

More experience amplifies the benefit of education!
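
The changing slope is easy to see in code (coefficients taken from the example above):

```python
# Coefficients from the interaction example above.
b_edu, b_int = 3_000, 500

def education_effect(experience):
    """Marginal effect of one more year of education at a given experience level."""
    return b_edu + b_int * experience

print(education_effect(0))    # 3000
print(education_effect(10))   # 8000
```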


Model Diagnostics

Check These:

  1. Linearity: Residuals vs. fitted values plot
  2. Normality: Q-Q plot of residuals
  3. Homoscedasticity: Constant variance in residual plot
  4. Independence: No patterns in residual sequence
  5. Influential points: Cook’s distance, leverage
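
Checks 1–3 are usually done visually. A minimal sketch with statsmodels and matplotlib on simulated data (the data and variable names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = sm.add_constant(rng.normal(size=(200, 2)))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=200)
model = sm.OLS(y, X).fit()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(model.fittedvalues, model.resid)          # linearity / equal variance
ax1.axhline(0, color="gray")
ax1.set(xlabel="Fitted values", ylabel="Residuals")
sm.qqplot(model.resid, line="45", fit=True, ax=ax2)   # normality of residuals
plt.show()
```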

Remedies for Violations:

  • Non-linearity: Transform variables or add polynomial terms
  • Non-normality: Transform Y variable
  • Heteroscedasticity: Transform Y or use weighted least squares
  • Outliers: Investigate, possibly remove with justification

Predictions and Confidence Intervals

Confidence Interval for Mean Response

Narrower interval: Predicting the average Y at given X values

Prediction Interval for Individual

Wider interval: Predicting Y for a specific new observation

(Includes both uncertainty in mean and individual variation)
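
With statsmodels, get_prediction returns both intervals at once; a sketch on simulated data (variable names and values are placeholders):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=100)
y = 5 + 2 * x + rng.normal(size=100)
model = sm.OLS(y, sm.add_constant(x)).fit()

new_x = sm.add_constant(np.array([4.0, 7.0]), has_constant="add")
pred = model.get_prediction(new_x)
# mean_ci_* columns: confidence interval for the mean response (narrower)
# obs_ci_* columns: prediction interval for an individual observation (wider)
print(pred.summary_frame(alpha=0.05))
```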


Summary

In this lesson, you learned:

  • Multiple regression includes several predictor variables
  • Coefficients are partial effects (controlling for other variables)
  • Adjusted R² penalizes for adding variables
  • F-test assesses overall model significance
  • Multicollinearity causes unstable coefficients; check with VIF
  • Dummy variables encode categorical predictors
  • Interactions capture when effects depend on other variables
  • Always check diagnostics and assumptions

Practice Problems

1. A model has R² = 0.70, n = 50, k = 4. Calculate adjusted R².

2. Model: GPA = 1.5 + 0.3(StudyHours) + 0.2(Sleep) - 0.1(Social)
   a) Interpret the coefficient for StudyHours.
   b) Predict GPA for StudyHours = 5, Sleep = 7, Social = 3.

3. Two predictors have correlation r = 0.95. What problem might this cause?

4. A variable has VIF = 8. What does this mean and what should you do?

Answers

1. R²_adj = 1 - (1-0.70)(50-1)/(50-4-1) = 1 - (0.30)(49)/45 = 1 - 14.7/45 = 1 - 0.327 = 0.673

2. a) “Holding sleep and social hours constant, each additional hour of study is associated with a 0.3-point increase in GPA.”
   b) GPA = 1.5 + 0.3(5) + 0.2(7) - 0.1(3) = 1.5 + 1.5 + 1.4 - 0.3 = 4.1 (would be capped at 4.0 in reality)

3. Multicollinearity - the predictors are highly correlated, which can cause:

  • Unstable coefficient estimates
  • Large standard errors
  • Difficulty interpreting individual effects

4. VIF = 8 indicates high multicollinearity - this predictor is highly correlated with other predictors.

Options:

  • Remove this variable or a correlated variable
  • Combine correlated variables
  • Use regularization (ridge regression)
  • Be cautious interpreting this coefficient
