Function

Written by:

Ylva B Almquist

Basic command

mlogit depvar indepvars

Useful options

mlogit depvar indepvars, rrr 
mlogit depvar indepvars, rrr b(x)

Explanations
`depvar`	Insert the name of the y-variable.
`indepvars`	Insert the name of the x-variable(s) that you want to use.
`rrr`	Produces relative risk ratios.
`b(x)`	Specify the value of the base outcome. By default, the category with the most observations is chosen.
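As a hedged illustration of the commands above, the sketch below uses `sysdsn1`, one of Stata's online example datasets. This dataset is an assumption made for illustration: it is not part of this guide, and `webuse` requires internet access. Its outcome `insure` is a nominal variable with three categories coded 1 to 3.

```stata
* Load an example dataset from Stata's website (assumption: internet access)
webuse sysdsn1, clear

* Basic command: estimates are reported on the log scale
mlogit insure age male nonwhite

* Same model, reported as relative risk ratios
mlogit insure age male nonwhite, rrr

* Relative risk ratios, with the category coded 2 as the base outcome
mlogit insure age male nonwhite, rrr b(2)
```

Changing the base outcome does not change the fit of the model, only which category the other categories are compared with.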

Short names
`b`	`baseoutcome` (base outcome)

More information
help mlogit

Note
By default, the mlogit command reports coefficients on the log scale (log relative risk ratios). Specify the rrr option to obtain relative risk ratios instead.
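To make the note concrete, the standard multinomial logit formulation can be sketched as follows. The symbols are notation introduced here for illustration, not taken from the Stata output: $J$ outcome categories, base outcome $b$, and one coefficient vector $\beta_j$ per category.

```latex
\begin{align}
\Pr(y = j \mid x) &= \frac{\exp(x\beta_j)}{\sum_{k=1}^{J} \exp(x\beta_k)}, \qquad \beta_b = 0 \\
\ln \frac{\Pr(y = j \mid x)}{\Pr(y = b \mid x)} &= x\beta_j \\
\mathrm{RRR}_j &= \exp(\beta_j)
\end{align}
```

The second line is what the command reports by default. The third line is what the rrr option reports: the factor by which the risk of category $j$ relative to the base outcome changes for a one-unit increase in an x-variable.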

A walk-through of the output

When we perform a multinomial regression in Stata, the table looks like this:

In this example, yvar is a nominal variable with three categories, whereas xvar1 is a binary (0/1) variable and xvar2 is a continuous variable ranging between 100 and 500.

The upper part of the table shows a model summary. This is what the different rows mean:

Log likelihood	This value does not mean anything in itself, but can be used to compare nested models.
Number of obs	The number of observations included in the model.
LR chi2(x)	The likelihood ratio (LR) chi-square test of the model against a model with constants only. The number within the brackets shows the degrees of freedom: one per estimated coefficient, that is, the number of x-variables multiplied by the number of y-categories minus one. In this example, two x-variables and three categories give 2 × 2 = 4 degrees of freedom.
Prob > chi2	Shows the probability of obtaining a chi-square statistic at least this large if the x-variables had no effect on y. If the p-value is below 0.05, we can conclude that the overall model is statistically significant.
Pseudo R2	McFadden's pseudo R-squared, a type of R-squared value. Seldom used.
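The comparison of nested models mentioned under Log likelihood could be carried out along these lines. This is a minimal sketch that assumes the `sysdsn1` example dataset from Stata's website (not part of this guide); the stored names `full` and `reduced` are arbitrary labels chosen here.

```stata
* Example data (assumption: internet access for webuse)
webuse sysdsn1, clear

* Larger model
mlogit insure age male nonwhite
estimates store full

* Smaller (nested) model, restricted to the same observations
mlogit insure age male if e(sample)
estimates store reduced

* Likelihood ratio test of the larger against the smaller model
lrtest full reduced
```

The condition `if e(sample)` matters because a likelihood ratio test is only valid when both models are estimated on the same observations.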

The lower part of the table presents the parameter estimates from the analysis.

	The first column lists the y-variable on top, followed by one block of x-variables for each category of the y-variable except the base outcome. Each block compares that category with the base outcome.
RRR	These are the relative risk ratios (the exponentiated coefficients). Without the rrr option, this column shows the coefficients on the log scale instead.
Std. Err.	The standard errors associated with the estimates.
z	z-value (coefficient divided by the standard error of the coefficient). With the rrr option, it is still computed from the coefficient on the log scale.
P>|z|	P-value.
[95% Conf. Interval]	95% confidence intervals (lower limit and upper limit).
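The link between the columns can be checked by hand, as in the hedged sketch below. It again assumes the `sysdsn1` example dataset from Stata's website, and it assumes that the equation name `Prepaid` follows the value label of that category of `insure`; with your own data, the names depend on your value labels.

```stata
* Example data (assumption: internet access for webuse)
webuse sysdsn1, clear

* Estimate the model; estimates are shown on the log scale
mlogit insure age male nonwhite

* Replay the same results as relative risk ratios (no re-estimation)
mlogit, rrr

* RRR for age in the Prepaid equation: the exponentiated coefficient
display exp(_b[Prepaid:age])

* z-value for the same estimate: coefficient divided by its standard error
display _b[Prepaid:age] / _se[Prepaid:age]
```

The two displayed numbers should match the RRR and z columns for age in the Prepaid block of the replayed table.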

The analytical sample used for the examples

In the subsequent sections, we will use the following variables:

Dataset

StataData1.dta

Variable name	marstat40
Variable label	Marital status (Age 40, Year 2010)
Value labels	1=Married 2=Unmarried 3=Divorced 4=Widowed

Variable name	gpa
Variable label	Grade point average (Age 15, Year 1985)
Value labels	N/A

Variable name	sex
Variable label	Sex
Value labels	0=Man 1=Woman

Variable name	educ
Variable label	Educational level (Age 40, Year 2010)
Value labels	1=Compulsory 2=Upper secondary 3=University

sum marstat40 gpa sex educ

We define our analytical sample through the following command:

gen pop_multinom=1 if marstat40!=. & gpa!=. & sex!=. & educ!=.

This means that the new variable pop_multinom gets the value 1 if none of the four variables has missing information, and is missing otherwise. In this case, 8,409 individuals are included in our analytical sample.

tab pop_multinom
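As an optional cross-check of the sample definition, a complete-case flag can also be built with the `missing()` function. This is a sketch: it assumes StataData1.dta is loaded and pop_multinom has already been generated, and `pop_check` is a hypothetical variable name introduced here.

```stata
* Hypothetical 0/1 flag: 1 = no missing information on the four variables
gen byte pop_check = !missing(marstat40, gpa, sex, educ)

* Compare with the original flag; the 1/1 cell is the analytical sample
tab pop_check pop_multinom, missing

* Descriptive statistics restricted to the analytical sample
sum marstat40 gpa sex educ if pop_multinom==1
```

Unlike the comparison `!=.`, `missing()` also treats extended missing values (.a to .z) as missing, so the two flags could differ if the data contain such codes.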