Practical example with linear regression

For this example, we will use Approach A to conduct an interaction analysis based on linear regression. We want to see if sex (z) moderates the association between grade point average (x) and income (y).

Dataset
StataData1.dta
Variable name: income
Variable label: Annual salary income (Age 40, Year 2010)
Value labels: N/A

Variable name: gpa
Variable label: Grade point average (Age 15, Year 1985)
Value labels: N/A

Variable name: sex
Variable label: Sex
Value labels: 0=Man, 1=Woman

Define the analytical sample

We start by defining the analytical sample:

gen pop_interact1=1 if income!=. & gpa!=. & sex!=.
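As a quick check, the sample size can be counted and the flag compared against an alternative built with Stata's missing() function. This is only a sketch: the variable name pop_check is hypothetical and not part of the original example, and unlike the command above it codes excluded individuals as 0 rather than missing.

```stata
* Hypothetical alternative flag: 1 = complete data on all three variables, 0 = otherwise
gen byte pop_check = !missing(income, gpa, sex)

* Number of individuals in the analytical sample
count if pop_interact1==1

* The two flags should agree on who is included
tab pop_check pop_interact1, missing
```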

Let us have a quick look at the variables:

sum income gpa sex if pop_interact1==1

Simple regression models

First, we will run the simple models, one for gpa and income, and one for sex and income.

reg income gpa if pop_interact1==1

reg income sex if pop_interact1==1

There are statistically significant associations in both simple models. More specifically, the B coefficient for gpa is 37281 (95% CI: 33653 to 40909) and the B coefficient for sex is -76479 (95% CI: -81299 to -71659).

Multiple regression model

Next, we run a model with both independent variables included:

reg income gpa sex if pop_interact1==1

Compared with the simple models, the B coefficients increase quite a lot in absolute terms (i.e. they move further from 0).

Multiple regression model with interaction effect

In this step, we will include the interaction term using Approach 2. Two hashtags (##) mean that we specify the main effects and the interaction effect at the same time; the c. prefix marks gpa as continuous and the i. prefix marks sex as categorical:

reg income c.gpa##i.sex if pop_interact1==1

In the output, we can see that the estimate for the interaction term has a p-value below 0.05 (displayed as 0.000, i.e. p<0.001). This suggests that there is a statistically significant interaction effect between grade point average and sex on income.
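It may help to write out the model that this command fits. The notation below is an assumption made for illustration (the beta symbols do not appear in the original example); it shows why the coefficient for the interaction term is the difference in the gpa slope between women and men.

```latex
\begin{align*}
\text{income} &= \beta_0 + \beta_1\,\text{gpa} + \beta_2\,\text{sex}
                 + \beta_3\,(\text{gpa}\times\text{sex}) + \varepsilon \\[4pt]
\text{sex}=0 \text{ (men)}:\quad
\text{income} &= \beta_0 + \beta_1\,\text{gpa} + \varepsilon \\
\text{sex}=1 \text{ (women)}:\quad
\text{income} &= (\beta_0+\beta_2) + (\beta_1+\beta_3)\,\text{gpa} + \varepsilon
\end{align*}
```

Under this notation, the slope for men is \beta_1 and the slope for women is \beta_1 + \beta_3, so testing whether \beta_3 equals 0 is the test of the interaction.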

Interpretation

How can we understand this interaction effect that we found? Since our z-variable (sex) is binary, the easiest strategy for gaining more insight is to run sex-specific analyses of the association between gpa and income.

We start with a model for men, and then continue with the same for women.

reg income gpa if pop_interact1==1 & sex==0

reg income gpa if pop_interact1==1 & sex==1

The sex-specific models show that the slope in income according to grade point average is steeper among men than among women.
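The same sex-specific slopes can also be read directly from the interaction model, without splitting the sample. The following is a sketch rather than part of the original example. Because the model contains only gpa, sex and their product, the slope estimates should match those from the two separate models, although the standard errors can differ somewhat since the pooled model assumes a common residual variance.

```stata
* Re-fit the interaction model without showing the output
quietly reg income c.gpa##i.sex if pop_interact1==1

* Slope of gpa separately for men (sex=0) and women (sex=1)
margins sex, dydx(gpa)

* Slope for women as a linear combination:
* main effect of gpa plus the interaction term
lincom gpa + 1.sex#c.gpa
```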

Illustration

In order to illustrate the interaction, we can use the margins command. The first step is to re-run the model. The quietly prefix is placed at the beginning of the command to suppress the output.

quietly reg income c.gpa##i.sex if pop_interact1==1

Then we can produce the margins (the quietly prefix is used here as well):

quietly margins sex, at(gpa=(1 5))

Note
We specify 1 and 5 here since they represent the lowest and highest values for the variable gpa.

And then, finally, it is time to produce the marginsplot:

marginsplot
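If a more readable graph is wanted, the plot can be given a title and axis labels. This is a sketch only: the title and label texts are made up for illustration, and the choice to evaluate gpa at every whole number from 1 to 5 is an assumption (with a linear model the lines stay straight, so it only adds plotted points).

```stata
* Re-fit the interaction model without showing the output
quietly reg income c.gpa##i.sex if pop_interact1==1

* Predicted income for each sex at gpa = 1, 2, 3, 4, 5
quietly margins sex, at(gpa=(1(1)5))

* Plot the predictions with illustrative titles
marginsplot, title("Predicted income by gpa and sex") xtitle("Grade point average") ytitle("Predicted annual income")
```

Adding the noci option to marginsplot would hide the confidence intervals if the graph becomes cluttered.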

Summary
There is a positive, statistically significant association between grade point average at age 15 and income at age 40. While this association exists among men and women alike, the slope is steeper among men.