MANOVA

Written by:

Linnea Eriksson

*Original version written by Christoffer Åkesson

Quick facts

Number of variables
One group variable (x)
Two or more test variables (y)

Scales of variable(s)
Group variable: categorical
Test variables: continuous

Like ANOVA, MANOVA is used to test the significance of group differences. However, MANOVA can include several dependent variables, whereas ANOVA can handle only one dependent variable.

For example, you could use a MANOVA to investigate whether salary income and number of weekly work hours differ according to age categories (i.e. your test variables would be “income” and “number of weekly work hours”, while “age category” would be your group variable). Alternatively, you could use a MANOVA to investigate whether math and science performance differ based on test anxiety levels amongst students (i.e. your test variables would be “math test score” and “science test score”, while your group variable would be “test anxiety level”).

Note
MANOVA can be seen as an extension of ANOVA to two or more test variables. Accordingly, the advantages are that you can compare more than two groups and that the test variables are analysed jointly, taking the correlations between them into account.

Assumptions

First, you have to check your data to see that the assumptions behind MANOVA hold. If your data meet these assumptions, the result can be considered valid.

Checklist

Continuous and normally distributed test variables	Your test variables should be continuous (i.e. interval/ratio) and approximately normally distributed. For example: income, height, weight, number of years of schooling, and so on. Although they are not strictly continuous, ratings are very commonly treated as continuous variables, such as: “How satisfied with your income are you?” (on a scale 1-10) or “To what extent do you agree with the previous statement?” (on a scale 1-5).
Two or more unrelated categories in the group variable	Your group variable should be categorical (i.e. nominal or ordinal) and consist of two or more groups. Unrelated means that the groups should be mutually exclusive: no individual can be in more than one of the groups. For example: low vs. medium vs. high educational level; liberal vs. conservative vs. socialist political views; or poor vs. fair vs. good vs. excellent health; and so on.
Equal variance	The variance in each test variable, and the covariance between the test variables, should be equal across the groups of the group variable.
No outliers	An outlier is an extreme (low or high) value. For example, if most individuals have a test score between 40 and 60, but one individual has a score of 96 or another individual has a score of 1, this will distort the test.
Absence of multicollinearity	Your test variables should not be too correlated. A good rule of thumb is that no correlation should be above r = 0.90.
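
The checklist above can be explored with a few standard Stata commands. The following is only a rough sketch, not a complete diagnostic routine: it assumes the variables gpa, cognitive and skipped from the practical example further down, and which checks you rely on is a judgment call.

```stata
* Test of multivariate normality for the test variables
mvtest normality gpa cognitive

* Test whether the covariance matrices are equal across groups (Box's M)
mvtest covariances gpa cognitive, by(skipped)

* Box plots by group to look for outliers, one test variable at a time
graph box gpa, over(skipped)
graph box cognitive, over(skipped)

* Correlation between the test variables (rule of thumb: below 0.90)
pwcorr gpa cognitive, sig
```

With large samples, formal tests such as these tend to flag even small departures from the assumptions, so it is usually sensible to read them alongside the plots.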

Function

Basic command

manova testvars = groupvar

Explanations
`testvars`	Insert the names of the test variables
`groupvar`	Insert the name of the group variable

More information
help manova

Practical example

Dataset

StataData1.dta

Variable name	gpa
Variable label	Grade point average (Age 15, Year 1985)
Value labels	N/A

Variable name	cognitive
Variable label	Cognitive test score (Age 15, Year 1985)
Value labels	N/A

Variable name	skipped
Variable label	Skipped class (Age 15, Year 1985)
Value labels	1=Never 2=Sometimes 3=Often

manova gpa cognitive = skipped

In this example, we investigate whether grade point average and cognitive test scores differ between those who never, sometimes, and often have skipped class. Our null hypothesis is that there is no difference.

Stata provides four test statistics by default (listed above the table): Wilks’ lambda, Pillai’s trace, the Lawley-Hotelling trace, and Roy’s largest root. The most commonly used criterion is Wilks’ lambda and this is what will be used in this example. Thus, we need to consult the Prob>F column in the Wilks’ lambda (W) row to determine whether the null hypothesis should be rejected.

As can be seen, the F statistic is 118.53. The corresponding p-value is displayed as 0.0000 (i.e. p < 0.0001, which is below 0.05), meaning that there is a statistically significant difference between individuals who have never, sometimes, and often skipped class when grade point average and cognitive test scores are considered jointly. In other words, we can reject the null hypothesis. Note that this overall test does not tell us which of the test variables, or which of the groups, differ.
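
Since the multivariate test is an overall test, one common follow-up is to look at each test variable separately. A minimal sketch, assuming the same variables as above, is to run a one-way ANOVA per test variable. The command oneway is used here because it stores its results in r() and should therefore leave the MANOVA estimation results in place for the postestimation commands.

```stata
* One-way ANOVA for each test variable, with group means and
* standard deviations shown by the tabulate option
oneway gpa skipped, tabulate
oneway cognitive skipped, tabulate
```

Keep in mind that running several univariate tests raises the risk of false positives, so these results are best treated as descriptive follow-ups to the MANOVA.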

Postestimation commands

There are many different postestimation commands that you can apply to MANOVA.

More information
help manova postestimation

For example, it is often relevant to obtain the mean differences between the groups. We can use the postestimation command contrast to achieve this. The r. operator compares each group with the reference category (here, those who have never skipped class).

First, we can ask for the mean differences in gpa:

contrast r.skipped, equation(gpa)

In the column called Contrast, we see that the mean difference in grade point average between those who sometimes have skipped class and those who have never skipped class is -0.198. The mean difference between those who often have skipped class and those who have never skipped class is -0.383.

And then we can obtain the mean differences in cognitive:

contrast r.skipped, equation(cognitive)

Here, we see that the mean difference in cognitive test scores between those who sometimes have skipped class versus those who have never skipped class is -3.303. The mean difference between those who often have skipped class versus those who have never skipped class is -4.460.
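
The contrasts above compare each group with the reference category only. If the comparison between those who sometimes and those who often have skipped class is also of interest, the postestimation command pwcompare may be an option. The sketch below assumes that the MANOVA results are still in memory; the Bonferroni adjustment is one possible choice for handling multiple comparisons, not a recommendation from the original guide.

```stata
* All pairwise differences in gpa between the groups of skipped,
* with Bonferroni-adjusted p-values and confidence intervals
pwcompare skipped, equation(gpa) effects mcompare(bonferroni)

* The same for cognitive
pwcompare skipped, equation(cognitive) effects mcompare(bonferroni)
```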

We can also use the postestimation command margins, which gives us predicted means for each of the groups.

First, we get the predicted means in gpa:

margins skipped, predict(equation(gpa))

Looking at the column called Margin, we see that the predicted mean in grade point average for individuals who have never skipped class (3.330) is higher than the means for individuals who sometimes (3.132) and often (2.946) have skipped class, which is consistent with what we got with contrast.

And then we can get the predicted means in cognitive:

margins skipped, predict(equation(cognitive))

Here, we see that the predicted mean in cognitive test scores for individuals who have never skipped class (310.950) is higher than the means for individuals who sometimes (307.647) and often (306.489) have skipped class, which is consistent with what we got with contrast.
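
The predicted means can also be shown graphically. As a small sketch, marginsplot can be run directly after each margins command; it plots the most recent margins results together with their confidence intervals. The titles are only illustrative.

```stata
* Predicted means in gpa by group, followed by a plot
margins skipped, predict(equation(gpa))
marginsplot, title("Predicted mean GPA by skipped class")

* Predicted means in cognitive by group, followed by a plot
margins skipped, predict(equation(cognitive))
marginsplot, title("Predicted mean cognitive score by skipped class")
```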