Quick facts
Number of variables
Two (reflecting repeated measurement points)
Scales of variable(s)
Continuous
Introduction
The independent samples t-test compares the mean of one variable between two groups. The paired samples t-test instead compares two measurements of the same variable for the same individuals, and examines whether the mean changes from one measurement point to the other.
| Example |
| Happiness score before summer vacation | Happiness score after summer vacation |
| We are interested in seeing whether people’s happiness differs before and after summer vacation. We measure their happiness score before they leave for vacation, and then once again when they come back. Thus, we have two measurement points for the same sample of individuals. |
| Note The paired samples t-test is sometimes referred to as the paired t-test or the dependent t-test. |
T-distribution
The paired samples t-test assumes a t-distribution.
The t-distribution is a theoretical distribution that looks rather similar to a normal distribution but has heavier tails. In other words, the curve is somewhat flatter and wider than a normal distribution. The larger the sample size, the more similar the t-distribution will be to a normal distribution.
| Note The t-distribution is a theoretical distribution. We rely on it because the population parameter is unknown and has to be estimated from the sample (see The “unknown population parameter”). |
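As a rough illustration of how the t-distribution approaches the normal distribution, the critical values for a two-sided test at the 5% level can be compared in Stata. This is only a minimal sketch: the degrees of freedom used (5, 30 and 1000) are arbitrary values picked for illustration and do not come from this guide.

```stata
* Upper 2.5% critical value of the t-distribution
* for increasing degrees of freedom (arbitrary illustrative values)
display invttail(5, 0.025)
display invttail(30, 0.025)
display invttail(1000, 0.025)

* Corresponding critical value of the standard normal distribution
display invnormal(0.975)
```

The critical values should shrink towards the normal value as the degrees of freedom grow, which reflects the heavier tails of the t-distribution at small sample sizes.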
T-statistic
A t-test will produce a t-statistic (t). This is a standardised value that is calculated based on the study sample we have.
The t-distribution is based on the assumption that the null hypothesis is true (see Hypotheses). A t-statistic that is 0 means that the result from the t-test exactly reflects the null hypothesis (i.e., there is no difference between the measurement points). The larger the absolute value of the t-statistic, whether positive or negative, the further we get from the null hypothesis.
A t-statistic does not mean so much in itself. It is difficult to directly assess whether it is high or not.
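For reference, the t-statistic of the paired samples t-test is usually written as shown below. The symbols are introduced here for illustration and are not used elsewhere in this guide: x_{i,1} and x_{i,2} are the first and second measurement for individual i, d_i is the difference between them, \bar{d} is the mean of these differences, s_d is their standard deviation, and n is the number of individuals.

```latex
d_i = x_{i,1} - x_{i,2}, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{n}}
```

A mean difference of 0 gives t = 0. For a given mean difference, t moves further from 0 when the differences vary less between individuals or when the sample is larger.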
Degrees of freedom
Degrees of freedom is a rather tricky concept to make sense of. Applied to the paired samples t-test, degrees of freedom is the same as the number of observations (i.e., individuals) minus 1 (n-1).
P-value
When performing a paired samples t-test, we want to examine whether there is sufficient evidence to reject the null hypothesis (which stipulates that there is no difference between the measurement points).
The larger the absolute value of the t-statistic, the lower the two-sided p-value. A p-value that is lower than 0.05 means that we can reject the null hypothesis at the conventional 5% significance level.
In Stata, there are three p-values that are reported when we perform a paired samples t-test:
| Ha: mean(diff) != 0 | Two-sided t-test |
| Ha: mean(diff) < 0 | One-sided t-test (left tail) |
| Ha: mean(diff) > 0 | One-sided t-test (right tail) |
| Note Generally, we focus on the two-sided t-test since we make no assumption of the direction of the change, i.e., we do not specifically assume that there is either a lower or higher value at Time A compared to Time B. For the one-sided (left tail) t-test, we assume that there is a lower mean value at Time A compared to Time B. For the one-sided (right tail) t-test, we assume that there is a higher mean value at Time A compared to Time B. For statistical reasons, the one-sided t-tests increase the power to obtain p-values below 0.05, but we also risk missing changes that go in the opposite direction. |
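The relationship between the t-statistic, the degrees of freedom and the three p-values can be sketched with Stata's ttail() function, which returns the upper-tail probability of the t-distribution. The values of t_stat and dof below are hypothetical and chosen only for illustration; they are not taken from any dataset in this guide.

```stata
* Hypothetical t-statistic and degrees of freedom (illustration only)
scalar t_stat = 2.1
scalar dof    = 24

* ttail(df, t) returns P(T > t) for a t-distribution with df degrees of freedom
display "Two-sided:              p = " 2*ttail(dof, abs(t_stat))
display "One-sided (left tail):  p = " 1 - ttail(dof, t_stat)
display "One-sided (right tail): p = " ttail(dof, t_stat)
```

The two one-sided p-values always sum to 1, and the two-sided p-value is twice the smaller of them.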
Assumptions
First, you have to check your data to see that the assumptions behind the paired samples t-test hold. If your data “passes” these assumptions, you will have a valid result.
Below is a checklist for these assumptions.
| Continuous variables | Your two variables should be continuous (i.e. interval/ratio). For example: Income, height, weight, number of years of schooling, and so on. Although they are not really continuous, it is still very common to use ratings as continuous variables, such as: “How satisfied with your income are you?” (on a scale 1-10) or “To what extent do you agree with the previous statement?” (on a scale 1-5). |
| Two measurement points | Your two variables should reflect one single phenomenon, but this phenomenon is measured at two different time points for each individual. |
| Normally distributed variables | Both variables need to be approximately normally distributed. Use a histogram to check (see Histogram). |
| No outliers in the comparison between the two measurement points | For example, if one individual has an extremely low value at the first measurement point and an extremely high value at the second measurement point (or vice versa), this will distort the test. Use a scatterplot to check (see Scatterplot). |
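The graphical checks in the list above could be run along the following lines. This is a minimal sketch: testvar1 and testvar2 are placeholders for your own two measurement variables, and diff_check is a hypothetical variable name introduced here. The last step is an addition that is not part of the checklist above: many textbooks state the normality assumption in terms of the paired differences, so it can be worth inspecting them as well.

```stata
* testvar1 and testvar2 are placeholders for the two measurement variables

* Histograms with a normal density curve overlaid
histogram testvar1, normal
histogram testvar2, normal

* Scatterplot of the second measurement against the first, to spot outliers
scatter testvar2 testvar1

* Optional extra check (not in the checklist above):
* distribution of the paired differences
generate diff_check = testvar1 - testvar2
histogram diff_check, normal
```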
Function
| Basic command |
| ttest testvar1==testvar2 |
| Explanations | |
| testvar1 | Insert the name of the variable for the first measurement point. |
| testvar2 | Insert the name of the variable for the second measurement point. |
| More information | help ttest |
Practical example
| Dataset |
| StataData1.dta |
| Variable name | unemp_42 |
| Variable label | Days in unemployment (Age 42, Year 2012) |
| Value labels | N/A |
| Variable name | unemp_43 |
| Variable label | Days in unemployment (Age 43, Year 2013) |
| Value labels | N/A |
| Note Since the variables are extremely skewed (due to a lot of zeros), we are restricting the analysis to those who did not have the value 0 at age 42. Hence, the part stating “ if unemp_42!=0” in the command below. |
| ttest unemp_42==unemp_43 if unemp_42!=0 |

To start with, the mean is much higher at age 42 (147.998) than at age 43 (38.625), which is a difference of 109.372.
The t-statistic in this example is 31.0816, with 1057 degrees of freedom.
The corresponding p-value is reported as 0.0000 (look below “Ha: mean(diff) != 0”), i.e., it rounds to zero at four decimals. This is below 0.05, which allows us to reject the null hypothesis (which postulates that there is no mean difference between the two measurement points).
In other words, there is a significant difference between the two measurement points: the mean number of days in unemployment is significantly lower at age 43 than at age 42 in this example.
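As a cross-check, the same result can be obtained in two other ways. The sketch below assumes that StataData1.dta is already loaded; diff_unemp is a hypothetical variable name introduced here. It relies on the fact that a paired samples t-test is equivalent to a one-sample t-test of whether the mean of the individual differences equals 0.

```stata
* Assumes StataData1.dta is already loaded

* Paired samples t-test as in the example above
ttest unemp_42==unemp_43 if unemp_42!=0

* Results stored by ttest: t-statistic, degrees of freedom, two-sided p-value
display r(t)
display r(df_t)
display r(p)

* Equivalent one-sample t-test on the individual differences
* (diff_unemp is a hypothetical variable name)
generate diff_unemp = unemp_42 - unemp_43 if unemp_42!=0
ttest diff_unemp == 0

* Two-sided p-value recomputed from the reported t-statistic and degrees of freedom
display 2*ttail(1057, 31.0816)
```

The one-sample test on diff_unemp should report the same t-statistic and degrees of freedom as the paired test, since both use the same individuals and the same differences.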

