6 Interactions and Non-Additivity

6.1 Overview

Up to this point, we’ve assumed that the linear regression model is additive – the effect of one predictor on \(Y\) doesn’t depend on the value of another predictor. But this assumption is often unrealistic. Does the effect of authoritarianism on institutional trust differ across partisan groups? Does the relationship between income and vote margins change depending on a community’s racial composition? These are questions about interactions.

In this chapter we develop the interaction model, show how it generalizes the additive (intercept-shift) model from the previous chapter, and apply it to both the Western States Survey and the Arizona precinct data.

6.2 The Arizona Precinct Data

Throughout this chapter and the next, we use the Arizona precinct data – 1,688 precincts from the 2024 presidential election, with voter registration, turnout, and Census demographics linked at the tract level.

Data: Arizona Precincts

These data come from the Arizona Secretary of State voter file, merged with ACS tract-level demographics. Each row is a precinct. The dependent variable is Trump’s margin over Harris (positive = Trump advantage). The predictors are median household income, Latino population share, median age, and the Gini index of income inequality.

Table 6.1

load("precinct_voter_summary.rda")
load("precinct_tract_data.rda")

# Convert raw ACS counts to percentages
precinct_voter_summary <- precinct_voter_summary |>
  dplyr::mutate(
    pct_latino = (tract_acs_latino / tract_acs_total_population) * 100,
    pct_white  = (tract_acs_non_latino_white / tract_acs_total_population) * 100
  )

head(precinct_voter_summary[, c("dos_precinct_key", "trump_harris_margin",
                                 "tract_acs_median_household_income",
                                 "pct_latino", "tract_acs_median_age",
                                 "tract_acs_gini_index")])

# A tibble: 6 × 6
  dos_precinct_key trump_harris_margin tract_acs_median_household_i…¹ pct_latino
  <chr>                          <dbl>                          <dbl>      <dbl>
1 0001 ACACIA                   0.0430                          69005       31.4
2 0002 ACOMA                    0.254                           95395       21.3
3 0003 ACUNA                   -0.508                           49849       93.7
4 0004 ADOBE                    0.115                           78531       30.8
5 0005 ADORA                    0.194                          156354       15.5
6 0006 AGRITOPIA                0.0943                         101179       15.7
# ℹ abbreviated name: ¹tract_acs_median_household_income
# ℹ 2 more variables: tract_acs_median_age <dbl>, tract_acs_gini_index <dbl>

dplyr

The code above uses mutate() and the pipe. The other verbs listed here are common companions:

mutate() creates new columns (or modifies existing ones). Here we divide the raw Latino count by total population and multiply by 100 to get a percentage.
select() chooses specific columns for subsetting.
filter() keeps only rows that meet a condition (e.g., filter(!is.na(trump_harris_margin))).
group_by() + summarize() computes summary statistics within groups.
The pipe |> passes the result of one step as the first argument to the next.
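
The verbs that the chunk above does not use can be sketched on a small made-up data frame. The object `toy`, its columns, and all of its values are hypothetical stand-ins for illustration, not the precinct file.

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration only
toy <- data.frame(
  county = c("A", "A", "B", "B", "B"),
  margin = c(0.10, NA, -0.20, 0.30, 0.05),
  latino = c(120, 80, 300, 50, 40),
  total  = c(400, 400, 500, 250, 400)
)

county_summary <- toy |>
  filter(!is.na(margin)) |>                     # drop rows with a missing outcome
  mutate(pct_latino = latino / total * 100) |>  # count -> percentage
  select(county, margin, pct_latino) |>         # keep only the columns we need
  group_by(county) |>
  summarize(mean_margin = mean(margin), n = n())

county_summary
```

Each step receives the data frame produced by the previous one, so the pipeline reads top to bottom in the order the operations are applied.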

Figure 6.1 shows the geographic distribution of Trump’s margin across Arizona precincts.

Figure 6.1: Arizona precincts colored by Trump-Harris margin.

We begin with a baseline additive model: can tract-level ACS demographics predict the Trump-Harris margin?

Table 6.2

fit_az <- lm(trump_harris_margin ~ tract_acs_median_household_income +
               pct_latino + tract_acs_median_age + tract_acs_gini_index,
             data = precinct_voter_summary)
summary(fit_az)


Call:
lm(formula = trump_harris_margin ~ tract_acs_median_household_income + 
    pct_latino + tract_acs_median_age + tract_acs_gini_index, 
    data = precinct_voter_summary)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0910 -0.1948 -0.0197  0.1774  1.1668 

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)                        1.567e-02  6.626e-02   0.236    0.813    
tract_acs_median_household_income  1.213e-06  2.187e-07   5.545 3.40e-08 ***
pct_latino                        -1.575e-03  3.810e-04  -4.134 3.75e-05 ***
tract_acs_median_age               1.048e-02  6.768e-04  15.490  < 2e-16 ***
tract_acs_gini_index              -1.199e+00  1.115e-01 -10.756  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2896 on 1674 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2631,    Adjusted R-squared:  0.2613 
F-statistic: 149.4 on 4 and 1674 DF,  p-value: < 2.2e-16

Figure 6.2: Predicted vs. observed Trump-Harris margin across Arizona precincts.
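
The raw coefficients in Table 6.2 are hard to read because income is measured in dollars. The short sketch below rescales two of them, using estimates copied from the output above. It assumes the margin is recorded as a proportion, as the values in Table 6.1 suggest, and the increments ($10,000 and 10 percentage points) are arbitrary choices for illustration.

```r
# Estimates copied from Table 6.2
b_income <- 1.213e-06   # per dollar of median household income
b_latino <- -1.575e-03  # per percentage point Latino

# Predicted change in margin for a $10,000 increase in median income
effect_income_10k <- b_income * 10000

# Predicted change in margin for a 10-point increase in percent Latino
effect_latino_10pt <- b_latino * 10

round(c(income_10k = effect_income_10k, latino_10pt = effect_latino_10pt), 4)
```

Holding the other predictors fixed, a $10,000 higher median income is associated with a margin roughly 0.012 higher, and a 10-point higher Latino share with a margin roughly 0.016 lower.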

6.3 Additive to Interactive

6.3.1 The Additive Model

In the previous chapter, we saw how dummy variables produce intercept shifts – parallel lines with different baselines. The additive model assumes the slope is the same for every group. For the Western States Survey data:

\[ Y_i = \alpha + \gamma_{\text{Rep}} D_{\text{Rep}} + \gamma_{\text{Dem}} D_{\text{Dem}} + \beta_{\text{Auth}} X_{\text{Auth}} + e_i \]

6.3.2 The Interaction Model

Recall that the additive model assumes the effect of authoritarianism on institutional trust is the same for Republicans, Democrats, and Independents. The intercept shifts, but the slope doesn’t. What if the effect of authoritarianism differs across party groups? The additive model cannot capture this, because every group’s conditional expectation has the same slope:

\[\begin{eqnarray*} E(Y_i \mid \text{Republican}) &=& (\alpha + \gamma_{\text{Rep}}) + \beta_{\text{Auth}} X_{\text{Auth}} \\ E(Y_i \mid \text{Democrat}) &=& (\alpha + \gamma_{\text{Dem}}) + \beta_{\text{Auth}} X_{\text{Auth}} \\ E(Y_i \mid \text{Independent}) &=& \alpha + \beta_{\text{Auth}} X_{\text{Auth}} \end{eqnarray*}\]

Here, \(\frac{\partial Y}{\partial X_{\text{Auth}}} = \beta_{\text{Auth}}\) for every group. We can relax this by estimating an interactive model that allows the slope on authoritarianism to differ by party:

\[ Y_i = \alpha + \gamma_{\text{Rep}} D_{\text{Rep}} + \gamma_{\text{Dem}} D_{\text{Dem}} + \beta_{\text{Auth}} X_{\text{Auth}} + \delta_{\text{Rep}} (X_{\text{Auth}} \times D_{\text{Rep}}) + \delta_{\text{Dem}} (X_{\text{Auth}} \times D_{\text{Dem}}) + e_i \]

This amounts to creating two additional variables: one multiplies authoritarianism by the Republican dummy, and the other multiplies authoritarianism by the Democrat dummy. Including both in the model gives each group its own conditional expectation:

\[\begin{eqnarray*} E(Y_i \mid \text{Independent}) &=& \alpha + \beta_{\text{Auth}} X_{\text{Auth}} \\ E(Y_i \mid \text{Republican}) &=& (\alpha + \gamma_{\text{Rep}}) + (\beta_{\text{Auth}} + \delta_{\text{Rep}}) X_{\text{Auth}} \\ E(Y_i \mid \text{Democrat}) &=& (\alpha + \gamma_{\text{Dem}}) + (\beta_{\text{Auth}} + \delta_{\text{Dem}}) X_{\text{Auth}} \end{eqnarray*}\]

The slope on authoritarianism now depends on party identification, and each coefficient has a specific interpretation:

\(\beta_{\text{Auth}}\) is the slope of authoritarianism for Independents (the reference group)
\(\delta_{\text{Rep}}\) is the difference in the authoritarianism slope between Republicans and Independents
\(\delta_{\text{Dem}}\) is the difference in the authoritarianism slope between Democrats and Independents
\(\gamma_{\text{Rep}}\) and \(\gamma_{\text{Dem}}\) are the intercept differences when \(X_{\text{Auth}} = 0\)

The lines are no longer parallel – each group gets its own intercept and its own slope.

Taking the derivative with respect to authoritarianism makes this explicit. The marginal effect is a function of party identification:

\[\frac{\partial Y}{\partial X_{\text{Auth}}} = \beta_{\text{Auth}} + \delta_{\text{Rep}} D_{\text{Rep}} + \delta_{\text{Dem}} D_{\text{Dem}}\]
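
One way to see that the interactive model gives each group its own line is to compare it with separate regressions fit within each group. The sketch below uses simulated data: the data frame `sim`, its variables, and every parameter value are invented and are not the Western States Survey. It checks that the slopes implied by the interaction coefficients match the group-by-group slopes.

```r
set.seed(42)
n <- 900

# Simulated three-group data; Independents are the reference category
party      <- sample(c("Ind", "Rep", "Dem"), n, replace = TRUE)
republican <- as.numeric(party == "Rep")
democrat   <- as.numeric(party == "Dem")
auth       <- runif(n)

# Invented "true" slopes: 0.5 (Ind), 0.2 (Rep), 0.8 (Dem)
trust <- 0.4 + 0.2 * republican + 0.1 * democrat +
  0.5 * auth - 0.3 * auth * republican + 0.3 * auth * democrat +
  rnorm(n, sd = 0.1)

sim <- data.frame(trust, auth, republican, democrat, party)

fit <- lm(trust ~ auth * republican + auth * democrat, data = sim)
b   <- coef(fit)

# Group-specific slopes implied by the interaction model
slopes_interaction <- c(
  Ind = unname(b["auth"]),
  Rep = unname(b["auth"] + b["auth:republican"]),
  Dem = unname(b["auth"] + b["auth:democrat"])
)

# Slopes from a separate regression within each group
slopes_separate <- sapply(c("Ind", "Rep", "Dem"), function(g) {
  unname(coef(lm(trust ~ auth, data = sim[sim$party == g, ]))["auth"])
})

round(rbind(slopes_interaction, slopes_separate), 3)
```

The point estimates agree. What the pooled interactive model adds is a single residual variance and direct tests of the slope differences \(\delta_{\text{Rep}}\) and \(\delta_{\text{Dem}}\).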

Always Remember to Include Lower-Order Terms

Always include the lower-order constituent terms in an interactive model. Consider what happens if we drop the party dummies:

\[Y_i = \alpha + \beta_{\text{Auth}} X_{\text{Auth}} + \delta_{\text{Rep}} (X_{\text{Auth}} \times D_{\text{Rep}}) + \delta_{\text{Dem}} (X_{\text{Auth}} \times D_{\text{Dem}}) + e_i\]

This specification omits \(\gamma_{\text{Rep}}\) and \(\gamma_{\text{Dem}}\), forcing the assumption that all three groups share the same intercept (i.e., no group differences when authoritarianism = 0). Conversely, we could drop the main effect of authoritarianism:

\[ Y_i = \alpha + \gamma_{\text{Rep}} D_{\text{Rep}} + \gamma_{\text{Dem}} D_{\text{Dem}} + \delta_{\text{Rep}} (X_{\text{Auth}} \times D_{\text{Rep}}) + \delta_{\text{Dem}} (X_{\text{Auth}} \times D_{\text{Dem}}) + e_i \]

This omits \(\beta_{\text{Auth}}\), forcing the slope to be zero for Independents. We can always test whether \(\gamma_{\text{Rep}} = \gamma_{\text{Dem}} = 0\) or \(\beta_{\text{Auth}} = 0\), but the full model should include all constituent terms.
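
A small simulation illustrates the cost of dropping a constituent term. In the sketch below (the names `x`, `d`, `y` and all parameter values are invented), the two groups have different intercepts but identical slopes, so the true interaction is zero. Omitting the dummy forces a shared intercept, and the model compensates through the interaction coefficient.

```r
set.seed(1)
n <- 2000

d <- rbinom(n, 1, 0.5)   # hypothetical group dummy
x <- runif(n)            # continuous predictor on [0, 1]

# True model: intercepts differ by 2, slopes are identical (no interaction)
y <- 1 + 2 * d + 1 * x + rnorm(n, sd = 0.1)

full     <- lm(y ~ x * d)     # includes x, d, and x:d
no_dummy <- lm(y ~ x + x:d)   # omits the lower-order term d

delta_full     <- unname(coef(full)["x:d"])
delta_no_dummy <- unname(coef(no_dummy)["x:d"])

round(c(full = delta_full, no_dummy = delta_no_dummy), 3)
```

With the dummy included, the interaction estimate sits near its true value of zero. Without it, the only way the model can separate the groups is by tilting one slope, so the intercept gap is absorbed into a large spurious interaction.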

Table 6.3

# Additive model (restricted)
fit_additive <- lm(institutional_trust ~ authoritarianism + republican + democrat, data = wss20)

# Interactive model (unrestricted)
fit_interaction <- lm(institutional_trust ~ authoritarianism * republican + authoritarianism * democrat, data = wss20)
summary(fit_interaction)


Call:
lm(formula = institutional_trust ~ authoritarianism * republican + 
    authoritarianism * democrat, data = wss20)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.58654 -0.10874  0.00852  0.12260  0.58245 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  0.40529    0.01215  33.348   <2e-16 ***
authoritarianism             0.04905    0.02221   2.208   0.0273 *  
republican                   0.16678    0.01637  10.188   <2e-16 ***
democrat                     0.03461    0.01379   2.510   0.0121 *  
authoritarianism:republican -0.03458    0.02819  -1.227   0.2201    
authoritarianism:democrat    0.02635    0.02579   1.021   0.3071    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1835 on 3425 degrees of freedom
  (169 observations deleted due to missingness)
Multiple R-squared:  0.1072,    Adjusted R-squared:  0.1059 
F-statistic: 82.25 on 5 and 3425 DF,  p-value: < 2.2e-16

# F-test: do the interaction terms jointly improve the model?
anova(fit_additive, fit_interaction)

Analysis of Variance Table

Model 1: institutional_trust ~ authoritarianism + republican + democrat
Model 2: institutional_trust ~ authoritarianism * republican + authoritarianism * 
    democrat
  Res.Df    RSS Df Sum of Sq      F Pr(>F)  
1   3427 115.57                             
2   3425 115.30  2   0.26505 3.9366 0.0196 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
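
The F statistic reported by anova() can be rebuilt by hand from the standard formula for comparing nested models, \(F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(n-k)}\). The sketch below plugs in the rounded quantities printed above, so small discrepancies in the last digit are expected.

```r
# Quantities copied from the anova() output above
rss_unrestricted <- 115.30   # RSS of the interactive model
ss_improvement   <- 0.26505  # reduction in RSS from adding the interactions
q                <- 2        # number of restrictions (two interaction terms)
df_resid         <- 3425     # residual df of the interactive model

f_stat  <- (ss_improvement / q) / (rss_unrestricted / df_resid)
p_value <- pf(f_stat, df1 = q, df2 = df_resid, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```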

Figure 6.3: Interaction between authoritarianism and party identification. Unlike the additive model, slopes now differ across groups.

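Compare Figure 6.3 to the additive model from the previous chapter: the lines are no longer parallel. The F-test tells us whether the improvement in fit from allowing different slopes is statistically meaningful. Here it is (F = 3.94, p = 0.0196), even though neither interaction coefficient in Table 6.3 is individually significant. The two results answer different questions: each t-test compares one party’s slope with the slope for Independents, whereas the F-test asks whether both slope differences are jointly zero.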
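
The group-specific slopes behind a figure like this follow directly from the coefficient table. The sketch below copies the point estimates from Table 6.3 and adds them up.

```r
# Estimates copied from Table 6.3
b_auth    <- 0.04905    # slope for Independents (reference group)
delta_rep <- -0.03458   # Republican slope difference
delta_dem <- 0.02635    # Democrat slope difference

slopes <- c(
  Independent = b_auth,
  Republican  = b_auth + delta_rep,
  Democrat    = b_auth + delta_dem
)
slopes
```

These are point estimates only. Standard errors for the Republican and Democrat slopes depend on the covariance between \(\hat{\beta}_{\text{Auth}}\) and each \(\hat{\delta}\), which the printed table does not report; vcov(fit_interaction) would supply it.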

6.4 Continuous-by-Continuous Interactions

The interaction we just estimated involved a continuous variable (authoritarianism) and a categorical variable (party ID). But interactions are not limited to this combination. We can also interact two continuous variables. The logic is the same – the effect of one variable depends on the value of another – but the interpretation requires more care because neither variable “turns on and off” like a dummy does.

Consider the Arizona precinct data. We might ask: does the effect of income on Trump’s margin depend on the community’s Latino population share? In a predominantly white precinct, higher income might push margins in one direction; in a predominantly Latino precinct, the income effect might be different – or even reversed.

The model is:

\[ \text{Margin}_i = \beta_0 + \beta_1 \text{Income}_i + \beta_2 \text{Latino}_i + \beta_3 (\text{Income}_i \times \text{Latino}_i) + \beta_4 \text{Age}_i + \beta_5 \text{Gini}_i + e_i \]

Now, what is the effect of income on the margin? Take the partial derivative with respect to income:

\[\frac{\partial \text{Margin}}{\partial \text{Income}} = \beta_1 + \beta_3 \times \text{Latino}_i\]

This is the key insight: the marginal effect of income is not a single number. It’s a function of percent Latino. At a precinct where \(\text{Latino} = 0\), the effect of income is just \(\beta_1\). At a precinct where \(\text{Latino} = 50\), it’s \(\beta_1 + 50\beta_3\). The interaction coefficient \(\beta_3\) tells you how much the income effect changes for each one-unit increase in percent Latino – but \(\beta_3\) by itself doesn’t tell you what the income effect actually is at any particular value of percent Latino. You need both \(\beta_1\) and \(\beta_3\) together.

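This is why the coefficient table is not the end of the analysis. The test on the interaction term tells you whether the income effect varies with Latino population share, not how large the income effect is at any given share, or whether it is distinguishable from zero there. You need to compute the marginal effect at specific values and plot it. Let’s do that.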

Table 6.4

fit_interact_az <- lm(trump_harris_margin ~ tract_acs_median_household_income *
                        pct_latino + tract_acs_median_age + tract_acs_gini_index,
                      data = precinct_voter_summary)
summary(fit_interact_az)


Call:
lm(formula = trump_harris_margin ~ tract_acs_median_household_income * 
    pct_latino + tract_acs_median_age + tract_acs_gini_index, 
    data = precinct_voter_summary)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.09183 -0.19607 -0.02004  0.17629  1.16503 

Coefficients:
                                               Estimate Std. Error t value
(Intercept)                                   2.389e-02  6.700e-02   0.357
tract_acs_median_household_income             1.365e-06  2.850e-07   4.790
pct_latino                                   -9.457e-04  8.452e-04  -1.119
tract_acs_median_age                          1.040e-02  6.839e-04  15.210
tract_acs_gini_index                         -1.231e+00  1.179e-01 -10.443
tract_acs_median_household_income:pct_latino -1.057e-08  1.267e-08  -0.834
                                             Pr(>|t|)    
(Intercept)                                     0.721    
tract_acs_median_household_income            1.81e-06 ***
pct_latino                                      0.263    
tract_acs_median_age                          < 2e-16 ***
tract_acs_gini_index                          < 2e-16 ***
tract_acs_median_household_income:pct_latino    0.404    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2896 on 1673 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2634,    Adjusted R-squared:  0.2612 
F-statistic: 119.6 on 5 and 1673 DF,  p-value: < 2.2e-16

6.4.1 Computing Marginal Effects

The marginal effect of income at any value of percent Latino is just the partial derivative we wrote above:

\[\frac{\partial \text{Margin}}{\partial \text{Income}} = \hat{\beta}_1 + \hat{\beta}_3 \times \text{Latino}\]

We plug in different values of percent Latino, compute the marginal effect at each one, and plot it.
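
As a first pass, the point estimates can be computed directly from the coefficients printed in Table 6.4. The grid of Latino shares below is an arbitrary choice for illustration, and the rescaling to a $10,000 change is only for readability.

```r
# Estimates copied from Table 6.4
b1 <- 1.365e-06    # income coefficient (income effect when Latino = 0)
b3 <- -1.057e-08   # income x Latino interaction coefficient

pct_latino_grid <- c(0, 25, 50, 75, 100)

# Marginal effect of one dollar of income at each Latino share
me_income <- b1 + b3 * pct_latino_grid

data.frame(
  pct_latino    = pct_latino_grid,
  me_per_dollar = me_income,
  me_per_10k    = me_income * 10000   # effect of a $10,000 change
)
```

The estimated income effect shrinks as the Latino share rises but stays positive across the whole 0 to 100 range. Whether these values differ from zero, or from one another, depends on their standard errors, which this arithmetic does not provide; note that the interaction coefficient itself has p = 0.404 in Table 6.4.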

Figure 6.4: Marginal effect of median household income on Trump-Harris margin at different levels of percent Latino. Shaded region is the 95% confidence interval. The red dashed line at zero indicates no effect.

How do we read Figure 6.4? The black line is the estimated marginal effect of income at each level of percent Latino. The shaded band is the 95% confidence interval. Where the band includes zero (the red dashed line), we cannot say the income effect is statistically distinguishable from zero at that level of Latino population. Where the band is entirely above or below zero, the income effect is statistically significant at the 5% level.

This is the right way to interpret a continuous interaction. The coefficient on the interaction term alone – \(\hat{\beta}_3\) – tells you the rate of change of the income effect as Latino percentage increases. But it doesn’t tell you what the income effect is at any particular value. You always need the full marginal effect calculation and the plot.
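
A confidence band of this kind needs a standard error for \(\hat{\beta}_1 + \hat{\beta}_3 Z\). By the usual variance rule for a linear combination, that is \(\sqrt{\text{Var}(\hat{\beta}_1) + Z^2 \text{Var}(\hat{\beta}_3) + 2Z\,\text{Cov}(\hat{\beta}_1, \hat{\beta}_3)}\). The sketch below applies the formula to simulated data; the names `x`, `z`, `y` and every parameter value are invented, and this is not necessarily how the figure in this chapter was produced.

```r
set.seed(123)
n <- 1000
x <- rnorm(n)            # focal predictor (hypothetical)
z <- runif(n, 0, 100)    # moderator on a 0-100 scale (hypothetical)
y <- 0.5 * x + 0.01 * z - 0.004 * x * z + rnorm(n)

fit <- lm(y ~ x * z)
b   <- coef(fit)
V   <- vcov(fit)         # variance-covariance matrix of the estimates

z_grid <- seq(0, 100, by = 10)

# Marginal effect of x and its standard error at each value of z
me <- unname(b["x"] + b["x:z"] * z_grid)
se <- sqrt(V["x", "x"] + z_grid^2 * V["x:z", "x:z"] +
             2 * z_grid * V["x", "x:z"])

crit <- qt(0.975, df = df.residual(fit))
me_table <- data.frame(
  z     = z_grid,
  me    = me,
  lower = me - crit * se,
  upper = me + crit * se
)

# Base-graphics version of a marginal effect plot
plot(me_table$z, me_table$me, type = "l",
     ylim = range(me_table$lower, me_table$upper),
     xlab = "z (moderator)", ylab = "Marginal effect of x")
lines(me_table$z, me_table$lower, lty = 3)
lines(me_table$z, me_table$upper, lty = 3)
abline(h = 0, col = "red", lty = 2)
```

At z = 0 the formula collapses to the ordinary standard error of the coefficient on x, which is a useful check that the covariance term has been handled correctly.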

6.5 Summary

This chapter extended the additive regression model to allow interactions – cases where the effect of one predictor depends on the value of another. Key takeaways:

Continuous-by-categorical interactions allow different slopes for different groups (e.g., the effect of authoritarianism differs by party).
Continuous-by-continuous interactions allow the marginal effect of one variable to vary smoothly with another (e.g., the income effect changes with Latino population share).
Always include the lower-order constituent terms when specifying an interaction.
Use marginal effect plots to interpret continuous interactions – the coefficient on the interaction term alone is not sufficient.
The F-test (anova()) can formally test whether interaction terms jointly improve the model over the restricted additive specification.

In the next chapter, we examine what happens when a core assumption of the linear model – constant error variance – is violated. The Arizona precinct data will turn out to be a natural setting for this problem.