Multiple ordinal regression

Written by:

Ylva B Almquist

Quick facts

Number of variables
One dependent (y)
At least two independent (x)

Scales of variable(s)
Dependent: ordinal
Independent: categorical (nominal/ordinal) or continuous (ratio/interval)

Theoretical example

Example
Suppose we want to examine whether having young children (x), residential area (x), and income (x) are related to alcohol consumption (y). Having young children is coded as 0=No young children and 1=Young children. Residential area has the values 1=Metropolitan, 2=Smaller city, and 3=Rural. We choose Metropolitan as our reference category. Income is measured as the yearly household income from salary in thousands of SEK (ranging between 100 and 700 thousand SEK). Alcohol consumption has the values 1=None/low, 2=Medium, 3=High.

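In the regression analysis, we get an OR for Young children of 0.65. Since the OR is below 1, those who have young children have lower odds of being in a higher alcohol consumption category than those without young children, i.e. they tend to drink less. This association is adjusted for residential area and income.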

With regard to residential area, we get an OR for Smaller city of 1.32, whereas the OR for Rural is 2.44. Both are compared to the reference category, Metropolitan. This suggests that those who live in a smaller city drink more alcohol than those in metropolitan areas, and so do those living in rural areas. These results are adjusted for having young children and income.

Finally, the OR for income is 0.95. This suggests that for every unit increase in income (i.e. for every additional one thousand SEK), the odds of being in a higher alcohol consumption category decrease by about 5%. This association is adjusted for having young children and residential area.
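The interpretation of these ORs can be made concrete with the proportional odds model that ordinal logistic regression is based on. The notation below (cut points \kappa_j, coefficients \beta, p x-variables after dummy coding) is introduced here for illustration and is not part of the example itself; it is a sketch of the standard parameterisation, not output from any analysis.

```latex
% Odds of being in a higher category than j (j = 1, 2 for a three-level outcome)
\[
\frac{P(y > j \mid x)}{P(y \le j \mid x)}
  = \exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p - \kappa_j\right)
\]

% The OR for one x-variable is the exponentiated coefficient,
% and it is assumed to be the same for every cut point j
\[
\mathrm{OR} = e^{\beta}
\]

% A k-unit increase multiplies the odds by OR^k.
% Income example: 10 units = 10 thousand SEK
\[
\mathrm{OR}^{k} = 0.95^{10} \approx 0.60
\]
```

In other words, an OR of 0.95 per thousand SEK looks small, but it corresponds to roughly 40% lower odds of being in a higher consumption category for every additional 10 thousand SEK.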

Practical example

Dataset

StataData1.dta

Variable name	educ
Variable label	Educational level (Age 40, Year 2010)
Value labels	1=Compulsory 2=Upper secondary 3=University

Variable name	gpa
Variable label	Grade point average (Age 15, Year 1985)
Value labels	N/A

Variable name	bullied
Variable label	Exposure to bullying (Age 15, Year 1985)
Value labels	0=No 1=Yes

Variable name	bestfriends
Variable label	Number of best friends (Age 15, Year 1985)
Value labels	1=No best friends 2=One best friend 3=Two best friends 4=Three best friends 5=Four or more best friends

sum educ gpa bullied bestfriends if pop_ordinal==1

ologit educ gpa bullied ib1.bestfriends if pop_ordinal==1, or

In this model, we have three x-variables: gpa, bullied, and bestfriends. When we put them together, their statistical effect on educ is mutually adjusted.

The odds ratios have changed in comparison to the simple regression models. For example, the odds ratio for gpa has increased from 4.63 to 4.68, which is a minor change. The odds ratio for bullied has moved close to 1 (from 0.71 to 0.97). Concerning the dummies of bestfriends, all odds ratios are now close to 1.

The association between gpa and educ remains statistically significant (p<0.05) after mutual adjustment. The associations between bullied and educ on the one hand, and between bestfriends and educ on the other hand, are no longer statistically significant.
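Because bestfriends enters the model as several dummy variables, each with its own p-value, it can be useful to test the variable as a whole. A minimal sketch, assuming the multiple regression model is refitted first so that it is the active estimation; testparm is a built-in Stata command, and no result is implied here:

```stata
* Refit the multiple model quietly so that it is the active estimation
quietly ologit educ gpa bullied ib1.bestfriends if pop_ordinal==1, or

* Joint Wald test that all bestfriends dummies equal zero
* (on the log-odds scale, i.e. all odds ratios equal 1)
testparm i.bestfriends
```

A large p-value from this test would be in line with reading the individual dummies as not associated with educ after adjustment.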

Note
A specific odds ratio from a simple ordinal regression model can increase when other x-variables are included. Usually, it is just “noise”, i.e. the increase is small, and therefore not much to be concerned about. But it can also reflect that there is something going on that we need to explore further. There are many possible explanations for increases in multiple regression models: a) We actually adjust for a confounder and then “reveal” the “true” statistical effect. b) There are interactions among the x-variables in their effect on the y-variable. c) There is something called collider bias (which we will not address in this guide), which basically means that both the x-variable and the y-variable cause another x-variable in the model. d) The simple regression models and the multiple regression model are based on different samples. e) It can be due to rescaling bias (see Mediation analysis).
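Explanation d) can be ruled out by forcing all models onto the same observations. The sketch below is one way to do this, using e(sample), which Stata sets after every estimation command. The variable name insample is hypothetical, and the step is only needed if pop_ordinal does not already exclude individuals with missing values on any of the variables.

```stata
* Fit the multiple model; e(sample) then flags the observations it used
quietly ologit educ gpa bullied ib1.bestfriends if pop_ordinal==1, or

* Save the flag (insample is a hypothetical variable name)
generate byte insample = e(sample)

* Refit the simple models on exactly the same observations
ologit educ gpa if insample==1, or
ologit educ bullied if insample==1, or
ologit educ ib1.bestfriends if insample==1, or
```

If the number of observations is identical across all four models, differences in the odds ratios cannot be due to differences in sample composition.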

Summary
In the fully adjusted model, the association between grade point average at age 15 and educational level at age 40 remains strong and statistically significant (OR=4.68; 95% CI=4.33-5.05). Exposure to bullying and number of best friends are, however, no longer associated with the outcome.

Estimates table and coefficients plot

If we have multiple models, we can facilitate comparisons between them by asking Stata to construct estimates tables and coefficients plots. We run the regression models one by one, save the estimates after each, and then use the commands estimates table and coefplot.

The coefplot command is not part of the standard Stata program, so unless you have already added this package, you need to install it:

ssc install coefplot

As an example, we can include the three simple regression models as well as the multiple regression model. The quietly prefix is placed at the beginning of each regression command to suppress the output.

Run and save the first simple regression model:

quietly ologit educ gpa if pop_ordinal==1, or

estimates store model1

Run and save the second simple regression model:

quietly ologit educ bullied if pop_ordinal==1, or

estimates store model2

Run and save the third simple regression model:

quietly ologit educ ib1.bestfriends if pop_ordinal==1, or

estimates store model3

Run and save the multiple regression model:

quietly ologit educ gpa bullied ib1.bestfriends if pop_ordinal==1, or

estimates store model4

Produce the estimates table (include the option eform to show odds ratios):

estimates table model1 model2 model3 model4, eform
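The default table shows many decimals and no indication of sample size or statistical significance. As a sketch, assuming model1 to model4 have been stored as above, the built-in options b(), star, stats() and p() of estimates table can make the comparison easier to read:

```stata
* Odds ratios with two decimals, significance stars, and the number of observations
estimates table model1 model2 model3 model4, eform b(%9.2f) star stats(N)

* Alternative: show p-values instead of stars (star cannot be combined with p)
estimates table model1 model2 model3 model4, eform b(%9.2f) p(%9.3f) stats(N)
```

Including stats(N) also gives a quick check that all models are based on the same number of observations.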

Produce the coefficients plot (include the option eform to show odds ratios):

coefplot model1 model2 model3 model4, eform

Note
You can improve the graph by using the Graph Editor to adjust the category and label names.
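As an alternative to the Graph Editor, labels can be set directly in the command, which makes the graph reproducible. The sketch below assumes model1 to model4 are stored as above; the label texts are suggestions only, and coefficients not listed in coeflabels() keep their default names. The /// line continuation works in a do-file, not when typing in the Command window.

```stata
* Label each model in the legend, relabel two coefficients,
* and draw a reference line at OR=1 (no association)
coefplot (model1, label("Simple: gpa")) ///
         (model2, label("Simple: bullied")) ///
         (model3, label("Simple: bestfriends")) ///
         (model4, label("Multiple")), ///
         eform xline(1) xtitle("Odds ratio") ///
         coeflabels(gpa = "Grade point average" bullied = "Exposure to bullying")
```

With the reference line at 1, confidence intervals that cross the line correspond to associations that are not statistically significant.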