Quick facts
Number of variables
One group variable
One test variable
Scales of variable(s)
Group variable: categorical with two values (binary)
Test variable: continuous or categorical (ordinal)
Introduction
It is not uncommon that at least one of the assumptions behind the independent samples t-test is violated. In most cases you can still conduct the test, but it is important to be aware of the possible problems.
Alternatively, you can use the Mann-Whitney u-test, which is the non-parametric counterpart to the independent samples t-test and relaxes some of the assumptions that were presented earlier.
The Mann-Whitney u-test is typically used when the test variable is not sufficiently normally distributed (e.g., when the test variable is measured on an ordinal scale).
| Note The Mann-Whitney u-test is sometimes referred to as the Wilcoxon-Mann-Whitney test or the Wilcoxon Rank-Sum test. |
Z-distribution
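For all but very small samples, the Mann-Whitney u-test relies on the z-distribution: if the null hypothesis is true, the test statistic approximately follows a z-distribution.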
The z-distribution is a special form of a normal distribution, where the mean is 0 and the standard deviation is 1.
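As a small illustration outside Stata, the z-distribution can be explored with Python's standard library. This is only a sketch; the cut-off of 1.96 is chosen here for illustration because it corresponds to the conventional 5% level.

```python
from statistics import NormalDist

# Standard normal (z) distribution: mean 0, standard deviation 1.
z_dist = NormalDist(mu=0.0, sigma=1.0)

# Share of the distribution lying below z = 1.96 (about 97.5%).
below = z_dist.cdf(1.96)

# Share lying beyond +/-1.96, counting both tails (about 5%).
two_tails = 2 * (1 - below)

print(round(below, 3), round(two_tails, 3))
```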
U-values and z-statistic
To perform the Mann-Whitney u-test, the rankings of the individual values first need to be determined. In other words, the test starts by ordering the values across the two groups and assigns each individual a rank. These rankings are then added up for each of the two groups and transformed into u-values.
From the u-values, we can calculate a z-statistic.
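The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not Stata's implementation: the scores are made up (they do not come from StataData1.dta), the function names are hypothetical, and the sketch assumes that no two individuals share the same value.

```python
import math

def rank_sums(group_a, group_b):
    """Rank all values jointly (1 = smallest) and sum the ranks per group.

    Assumes there are no tied values, to keep the sketch short.
    """
    pooled = sorted([(v, "a") for v in group_a] + [(v, "b") for v in group_b])
    sums = {"a": 0, "b": 0}
    for rank, (_, label) in enumerate(pooled, start=1):
        sums[label] += rank
    return sums["a"], sums["b"]

def u_and_z(group_a, group_b):
    """Turn the rank sums into u-values and a z-statistic (no ties)."""
    n_a, n_b = len(group_a), len(group_b)
    r_a, r_b = rank_sums(group_a, group_b)
    # U-value per group: rank sum minus its smallest possible value.
    u_a = r_a - n_a * (n_a + 1) / 2
    u_b = r_b - n_b * (n_b + 1) / 2
    # Mean and variance of U if the null hypothesis is true.
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b * (n_a + n_b + 1) / 12
    z = (u_a - mean_u) / math.sqrt(var_u)
    return u_a, u_b, z

# Hypothetical test scores for two small groups.
men = [12, 15, 18, 21, 25]
women = [10, 11, 14, 16, 17]
u_men, u_women, z = u_and_z(men, women)
print(u_men, u_women, round(z, 3))
```

With these made-up scores, the rank sums are 35 for men and 20 for women, which gives u-values of 20 and 5. The two u-values always add up to the product of the two group sizes (here 5 × 5 = 25).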
P-value
For each z-statistic, there is a corresponding p-value. A p-value that is lower than 0.05 means that we can reject the null hypothesis (which stipulates that there is no difference between the groups).
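The link between a z-statistic and its p-value can be sketched as follows. The example assumes a two-sided test based on the z-distribution; the function name and the z-values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for a z-statistic under the z-distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # conventional significance level used in the text

for z in (1.20, 2.50):  # hypothetical z-statistics
    p = two_sided_p(z)
    decision = "reject" if p < alpha else "do not reject"
    print(f"z = {z:.2f}, p = {p:.4f}: {decision} the null hypothesis")
```

The further the z-statistic lies from 0 (in either direction), the smaller the p-value becomes.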
| Note There is also something called “ties” that is relevant for the Mann-Whitney u-test. It basically means that two individuals can share the same rank (because they have the same value for the test variable). In this case, the calculation needs to be adjusted for ties. |
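One common way to handle ties is to give tied individuals the average of the ranks they would otherwise occupy, and to shrink the variance used for the z-statistic accordingly. The Python sketch below illustrates that textbook adjustment; it is not taken from Stata's source, and the function names and ordinal scores are hypothetical.

```python
import math
from collections import Counter

def midranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    counts = Counter(values)
    ranks, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]  # number of individuals sharing this value
        # Tied individuals occupy ranks next_rank .. next_rank + t - 1.
        ranks[value] = next_rank + (t - 1) / 2
        next_rank += t
    return ranks

def z_with_ties(group_a, group_b):
    """z-statistic using average ranks and a tie-adjusted variance."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    r_a = sum(ranks[v] for v in group_a)
    u_a = r_a - n_a * (n_a + 1) / 2
    # Each set of t tied values reduces the variance by a term in t^3 - t.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u_a - n_a * n_b / 2) / math.sqrt(var_u)

# Hypothetical ordinal scores with several shared values.
group_a = [3, 4, 4, 5, 5]
group_b = [1, 2, 3, 3, 4]
ranks = midranks(group_a + group_b)
z = z_with_ties(group_a, group_b)
print(ranks[3], ranks[4], ranks[5], round(z, 2))
```

In this made-up example, the three individuals with the value 3 would occupy ranks 3, 4 and 5, so each of them receives the average rank 4.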
Function
| Basic command |
| ranksum testvar, by(groupvar) |
| Explanations | |
| testvar | Insert the name of the variable that you want to test. |
| groupvar | Insert the variable defining the two groups. |
| More information | help ranksum |
Practical example
| Dataset |
| StataData1.dta |
| Variable name | cognitive |
| Variable label | Cognitive test score (Age 15, Year 1985) |
| Value labels | N/A |
| Variable name | sex |
| Variable label | Sex |
| Value labels | 0=Man 1=Woman |
| ranksum cognitive, by(sex) |

The z-statistic in this example is 5.100, with a p-value reported as 0.0000 (i.e., smaller than 0.00005). Since the p-value is below 0.05, we can reject the null hypothesis (which postulates that there is no difference between the two groups). Note that the Mann-Whitney u-test compares ranks rather than means.
In other words, there is a statistically significant difference in cognitive test scores between men and women in this example, to the advantage of men (just like the previous t-test also showed).
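A p-value printed as 0.0000 is not exactly zero; it is merely too small to show with four decimals. The Python sketch below makes this concrete for the z-statistic in this example, assuming that the reported p-value is the two-sided value based on the z-distribution.

```python
from statistics import NormalDist

z = 5.100  # z-statistic reported in the example

# Two-sided p-value; the lower tail is used to limit rounding error.
p = 2 * NormalDist().cdf(-abs(z))

print(f"{p:.4f}")  # rounded to four decimals, as in the output table
print(f"{p:.2e}")  # the same value in scientific notation
```

For this reason, such a result is usually reported as p < 0.001 rather than p = 0.0000.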
| Note We tend to recommend a pragmatic approach to the choice between parametric and non-parametric tests. If one of the assumptions behind the parametric test (i.e., the independent samples t-test) is violated, we strongly encourage you to perform the non-parametric test (i.e., the Mann-Whitney u-test) as a sensitivity analysis. If the latter leads to the same conclusion, it is preferable to use the former since it allows for further specifications. However, you should still exercise a bit of caution when it comes to reporting the exact mean differences. |