Quick facts
| Number of variables | One group variable; one test variable |
| Scales of variable(s) | Group variable: categorical; test variable: continuous |
Introduction
The one-way ANOVA is very similar to the independent samples t-test. The difference is that the one-way ANOVA allows you to have more than two categories in your group variable.
In other words, the one-way ANOVA is a parametric method for comparing the mean of one variable between two or more (unrelated) groups.
| Example |
| Mean number of ice cream cones per week during May in Swedish children ages 5-10 | Mean number of ice cream cones per week during June in Swedish children ages 5-10 | Mean number of ice cream cones per week during July in Swedish children ages 5-10 |
| We might be interested in knowing whether there are monthly differences in ice cream consumption among small children. Accordingly, we can compare the mean number of consumed ice cream cones across these three months. |
| Note The one-way ANOVA is considered an omnibus test since it only tells us whether there are significant differences overall, not exactly which groups differ from the others. There are nonetheless post-hoc tests that can accomplish this. |
| Note ANOVA stands for “analysis of variance”. The one-way ANOVA is sometimes referred to as one-factor ANOVA, one-way analysis of variance, or between-subjects ANOVA. |
F-distribution
The one-way ANOVA assumes an F-distribution.
The F-distribution is a continuous probability distribution (similar to the chi-square distribution). It is positively skewed and bounded at zero (i.e., it cannot go below 0).
The shape of the distribution is determined by two degrees of freedom: one for the numerator (df1) and one for the denominator (df2). The fewer the degrees of freedom, the closer the peak of the distribution lies to 0.
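To get a feel for how the degrees of freedom shape the distribution, you can plot the F density for a couple of values. Below is a minimal sketch using Stata's built-in Fden() function; the degrees of freedom used here (2 or 10 for the numerator, 50 for the denominator) are illustrative choices, not values from this chapter.

```stata
* Plot two F densities with different numerator degrees of freedom (illustrative values)
twoway (function y = Fden(2, 50, x), range(0 5))   ///
       (function y = Fden(10, 50, x), range(0 5)), ///
       ytitle("Density") xtitle("F")               ///
       legend(order(1 "df1 = 2, df2 = 50" 2 "df1 = 10, df2 = 50"))
```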
F-statistic
An F-statistic is the ratio of two variances. As you may remember from an earlier chapter (see Variation), the variance is the average of squared deviations from the mean value.
More specifically, we need these two estimates of the variance:
| Variance between groups | SSbetween | The sum of squares that represents the variation between the groups. |
| Variance within groups | SSwithin | The sum of squares that represents the variation within groups that is due to chance. |
The F-statistic is then calculated by dividing the variance between groups by the variance within groups.
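In formula form, with k groups and N observations in total, these two variances are the sums of squares from the table above divided by their respective degrees of freedom:

$$
F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/(N-k)}
$$

Here df1 = k − 1 and df2 = N − k are the degrees of freedom that determine the shape of the F-distribution described above.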
P-value
When performing a one-way ANOVA, we want to examine whether there is sufficient evidence to reject the null hypothesis (which stipulates that there is no difference between the groups).
A p-value that is lower than 0.05 means that we can reject the null hypothesis.
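The p-value is the probability, under the null hypothesis, of obtaining an F-statistic at least as large as the one observed. If you want to compute it yourself, Stata's Ftail() function returns this upper-tail probability; the degrees of freedom and F value below are made up purely for illustration.

```stata
* Upper-tail probability of the F-distribution: Ftail(df1, df2, F)
* df1 = 2, df2 = 147 and F = 5.21 are illustrative values only
display Ftail(2, 147, 5.21)
```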
Assumptions
First, you have to check your data to see whether the assumptions behind the one-way ANOVA hold. If your data “passes” these assumptions, you will have a valid result.
Below is a checklist for these assumptions; a brief Stata sketch for checking them follows the list.
| Continuous test variable | Your test variable should be continuous (i.e. interval/ratio). For example: Income, height, weight, number of years of schooling, and so on. Although they are not really continuous, it is still very common to use ratings as continuous variables, such as: “How satisfied with your income are you?” (on a scale 1-10) or “To what extent do you agree with the previous statement?” (on a scale 1-5). |
| Normally distributed test variable | The test variable should be approximately normally distributed. Use a histogram to check (see Histogram). |
| Two or more unrelated categories in the group variable | Your group variable should be categorical (i.e. nominal or ordinal) and consist of two or more groups. Unrelated means that the groups should be mutually exclusive: no individual can be in more than one of the groups. For example: low vs. medium vs. high educational level; liberal vs. conservative vs. socialist political views; or poor vs. fair vs. good vs. excellent health; and so on. |
| Equal variance | The variance in the test variable should be equal across the groups of the group variable. |
| No outliers | An outlier is an extreme (low or high) value. For example, if most individuals have a test score between 40 and 60, but one individual has a score of 96 or another individual has a score of 1, this will distort the test. |
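A few standard commands can help with these checks. The sketch below assumes the test variable is called income and the group variable educ (the same names as in the practical example further down); robvar reports Levene-type tests of equal variances.

```stata
* Normally distributed test variable: inspect the distribution
histogram income, normal

* Two or more unrelated categories: check that the groups are mutually exclusive and reasonably sized
tabulate educ

* Equal variance: Levene-type tests of equal variances across groups
robvar income, by(educ)

* No outliers: box plots by group make extreme values easy to spot
graph box income, over(educ)
```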
Function
| Basic command |
| oneway testvar groupvar |
| Useful options |
| tab bonferroni |
| Explanations | |
| testvar | Insert the name of the test variable. |
| groupvar | Insert the name of the group variable. |
| tab | Produce a summary table. |
| bonferroni | Report the results from a Bonferroni multiple-comparison test. |
| More information: help oneway |
| Note There are many different postestimation commands that you can apply to ANOVA. These options are described here: help anova postestimation |
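For instance, if you fit the same model with the anova command instead of oneway, Stata's general postestimation tools become available. The following is a brief sketch using the variables from the practical example below; it is not part of the original output.

```stata
* Fit the one-way model with -anova- so that postestimation commands can be used
anova income educ

* Pairwise comparisons of group means with Bonferroni-adjusted p-values
pwcompare educ, effects mcompare(bonferroni)
```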
Practical example
| Dataset | StataData1.dta |
| Variable name | income |
| Variable label | Annual salary income (Age 40, Year 2010) |
| Value labels | N/A |
| Variable name | educ |
| Variable label | Educational level (Age 40, Year 2010) |
| Value labels | 1=Compulsory 2=Upper secondary 3=University |
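Before running the test, it can be useful to load the dataset and take a quick look at the two variables. A minimal sketch (the file path is an assumption; adjust it to wherever StataData1.dta is stored):

```stata
* Load the example dataset (path assumed)
use "StataData1.dta", clear

* Quick overview of the two variables used in the example
codebook income educ, compact
tabulate educ
summarize income
```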
| oneway income educ, tab bonferroni |

The first table provides some summary statistics. Here we can see the mean income for the different groups:
- Compulsory education: 164316.86.
- Upper secondary education: 178904.49.
- University education: 238989.77.
The next table gives the F-statistic, which in this example is 331.85. Then look under Prob > F. Here we get a p-value of 0.0000. Since it is below 0.05, we can conclude that the group means are not all equal.
In the lower part of the same table, we get the results from Bartlett's test for equal variances. The null hypothesis for this test is that the variances are equal. Since we get a p-value (next to Prob>chi2) below 0.05, it suggests that the assumption of equal variances is violated.
| Note Violations of the assumption of equal variances often happen with large datasets like the one used in this example. Also, Bartlett's test is rather sensitive to data that are not normally distributed (the income variable used here is slightly skewed). Therefore, it might be a good idea to also perform a non-parametric test (in this case, a Kruskal-Wallis test). |
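A minimal sketch of this non-parametric alternative, using the same variables as above:

```stata
* Kruskal-Wallis test: non-parametric counterpart to the one-way ANOVA
kwallis income, by(educ)
```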
The fact that the F-statistic tells us that the group means are not all equal says very little about where the differences lie: which groups are different? To answer this, we can take a look at the third table, showing the results from the Bonferroni test. The first line of each entry represents the mean difference between the two groups being compared. The second line is the Bonferroni-adjusted p-value. In this example, the p-values are all 0.000 (which is below 0.05), suggesting that there are significant differences between all three groups.
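If you prefer to get these pairwise comparisons with explicit confidence intervals, the pwmean command offers an alternative route to the same Bonferroni-adjusted comparisons. This is a sketch, not part of the output shown above.

```stata
* Pairwise differences in mean income across education levels, Bonferroni-adjusted
pwmean income, over(educ) mcompare(bonferroni) effects
```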
