Non-parametric alternative: Wilcoxon signed rank test

Written by:

Ylva B Almquist

Quick facts

Number of variables
Two (reflecting repeated measurement points)

Scales of variable(s)
Continuous or categorical (ordinal)

Introduction

It is not uncommon that at least one of the assumptions behind the paired samples t-test is violated. While you will usually be able to conduct the test anyway, it is important to be aware of the possible problems.

Alternatively, you can use the Wilcoxon signed rank test, which is a non-parametric alternative to the paired samples t-test that relaxes some of the assumptions that were presented earlier.

The Wilcoxon signed rank test is specifically used when the variables (or, more precisely, the paired differences between them) are not sufficiently normally distributed (e.g., when you have variables on the ordinal scale).

Note
This test is primarily suitable for samples with <= 200 observations. By default, you will then obtain an exact p-value based on the actual randomization distribution of the test statistic. If you have more than 200 observations, you need to use the option called exact to obtain the exact p-value. The exact computation is only available for samples where n <= 2,000. Regardless of sample size, you always get an approximate p-value, which is based on a normal approximation to the randomization distribution.

Z-distribution

The Wilcoxon signed rank test relies on the z-distribution: in sufficiently large samples, the standardised test statistic approximately follows it.

The z-distribution is a special form of a normal distribution, where the mean is 0 and the standard deviation is 1.

W-statistic and z-statistic

To perform the Wilcoxon signed rank test, the paired difference between the two measurement points is first calculated for each observation (i.e., individual), and the absolute values of these differences are then ranked. Here, a rank of 1 indicates the smallest absolute difference in the score between the two measurement points. The ranks are then divided into positive and negative, according to the sign of the underlying difference. The sums of these positive and negative ranks are compared and transformed into a w-statistic.
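The ranking steps described above can be sketched by hand in Stata. This is a minimal illustration on a made-up toy dataset, not on StataData1.dta: the variable names before, after, diff, absdiff, rank_abs and signed_rank, and the scalars sum_pos and sum_neg, are all hypothetical names chosen for this example.

```stata
* Toy data: five individuals measured twice (hypothetical values)
clear
input before after
10 7
 4 5
 9 5
 6 4
 3 8
end

* Paired difference and its absolute value
generate diff = before - after
generate absdiff = abs(diff)

* Rank the absolute differences (rank 1 = smallest absolute difference)
egen rank_abs = rank(absdiff)

* Attach the sign of the difference to each rank
generate signed_rank = sign(diff) * rank_abs
list before after diff rank_abs signed_rank

* Sum the ranks separately for positive and negative differences
quietly summarize rank_abs if diff > 0
scalar sum_pos = r(sum)
quietly summarize rank_abs if diff < 0
scalar sum_neg = r(sum)
display "Sum of positive ranks = " scalar(sum_pos)
display "Sum of negative ranks = " scalar(sum_neg)

* The built-in command on the same toy data, for comparison
signrank before=after
```

In this toy example, the differences are 3, -1, 4, 2 and -5, so the positive ranks sum to 9 and the negative ranks sum to 6. With real data containing missing values, remember that Stata treats missing as larger than any number, so a condition such as diff > 0 should be combined with !missing(diff).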

From the w-statistic, we can calculate a z-statistic.
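One common textbook form of this step is sketched below, assuming no ties and no zero differences. The notation is introduced here for illustration only: n is the number of pairs, d_i the paired difference for individual i, R_i the rank of its absolute value, and W^+ the sum of the positive ranks. Stata's signrank may express the statistic differently and additionally adjusts the variance for ties and zero differences, so this should be read as an illustration of the idea rather than as the command's exact computation.

```latex
\[
W^{+} = \sum_{i:\, d_i > 0} R_i, \qquad
\mathrm{E}(W^{+}) = \frac{n(n+1)}{4}, \qquad
\mathrm{Var}(W^{+}) = \frac{n(n+1)(2n+1)}{24}
\]
\[
z = \frac{W^{+} - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}
\]
```

As a small worked example with made-up differences 3, -1, 4, 2 and -5: the absolute values receive the ranks 3, 1, 4, 2 and 5, so W^+ = 3 + 4 + 2 = 9. With n = 5, the expected value is 7.5 and the variance is 13.75, which gives z = 1.5 / 3.708, or approximately 0.40.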

P-value

For each z-statistic, there is a corresponding p-value. A p-value that is lower than 0.05 means that we can reject the null hypothesis (which stipulates that there is no difference between the measurement points).

Note
There is also something called “ties” that is relevant for the Wilcoxon signed rank test. It basically means that two or more individuals can share the same rank (because they have the same absolute difference between the two measurement points). In this case, the calculation needs to be adjusted for ties.
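To see what a shared rank looks like, the following sketch ranks three made-up absolute differences, two of which are equal. The variable names absdiff and rank_abs are hypothetical. It relies on the default behaviour of egen's rank() function, which gives tied values the average of the ranks they would otherwise occupy. It only illustrates the ranking itself, not the adjustment that signrank applies internally.

```stata
* Toy data: three absolute differences, two of them tied (hypothetical values)
clear
input absdiff
2
2
5
end

* By default, tied values receive the average of the ranks they span
egen rank_abs = rank(absdiff)
list absdiff rank_abs
```

Here the two observations with an absolute difference of 2 would have occupied ranks 1 and 2, so each receives rank 1.5, while the remaining observation receives rank 3.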

Function

Basic command

signrank testvar1=testvar2

Useful options

signrank testvar1=testvar2, exact

Explanations
`testvar1`	Insert the name of the variable for the first measurement point.
`testvar2`	Insert the name of the variable for the second measurement point.
`exact`	Specifies that the exact p-value be computed in addition to the approximate p-value.

More information
help signrank

Practical example

Dataset

StataData1.dta

Variable name	unemp_42
Variable label	Days in unemployment (Age 42, Year 2012)
Value labels	N/A

Variable name	unemp_43
Variable label	Days in unemployment (Age 43, Year 2013)
Value labels	N/A

Note
Since the variables are extremely skewed (due to a lot of zeros), we are restricting the analysis to those who did not have the value 0 at age 42. Hence, the part stating “if unemp_42!=0” in the command below.

signrank unemp_42=unemp_43 if unemp_42!=0, exact

The z-statistic in this example is 23.991, with an approximate p-value (Prob > |z|) of 0.0000, and an exact p-value (Exact Prob) of 0.0000. A reported value of 0.0000 means that the p-value is smaller than 0.00005, not that it is exactly zero. Since the latter p-value is below 0.05, this allows us to reject the null hypothesis (which postulates that there is no difference in the distribution between the two measurement points), thus confirming what we also saw for the paired samples t-test.

Note
We tend to recommend a pragmatic approach to the choice between the parametric test and its non-parametric alternative. If you experience a violation of the assumptions behind the parametric test (i.e., the paired samples t-test), we strongly encourage you to perform the non-parametric alternative (i.e., the Wilcoxon signed rank test) as a sensitivity analysis. If the latter leads to the same conclusion, it is preferable to use the former since it allows for further specifications. However, you might still exercise a bit of caution when it comes to reporting the exact mean differences.