Quick facts
| Number of variables | One group variable; one test variable |
| Scales of variable(s) | Group variable: categorical with two values (binary); test variable: continuous |
Introduction
The independent samples t-test is a parametric method for comparing the mean of one variable between two (unrelated) groups.
| Example |
| Mean income among men | Mean income among women |
| Let us assume that you want to see whether the income of teachers differs between men and women. The independent samples t-test can be used to compare the mean income between the two groups. |
| Note: The independent samples t-test is sometimes referred to as the two-sample t-test, independent t-test, or Student’s t-test. |
T-distribution
The independent samples t-test compares its test statistic against a t-distribution (see T-distribution).
T-statistic
A t-test will produce a t-statistic (t). This is a standardised value that is calculated based on the study sample we have.
The t-distribution is based on the assumption that the null hypothesis is true (see Hypotheses). A t-statistic of 0 means that the result from the t-test exactly reflects the null hypothesis (i.e., there is no difference between the groups). The further the t-statistic is from 0, in either direction, the further we get from the null hypothesis.
A t-statistic does not mean much in itself. It is difficult to assess directly whether it is high or not.
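As a reference point, the statistic behind the default (equal-variance) form of the test can be written out. This is the standard pooled-variance formula rather than something stated in this guide, and the symbols are notation assumed here: $\bar{x}_1, \bar{x}_2$ are the group means, $s_1^2, s_2^2$ the group variances, and $n_1, n_2$ the group sizes.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

The numerator is the difference in means and the denominator is its standard error, which is why a t-statistic of 0 corresponds to no difference between the groups.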
Degrees of freedom
Degrees of freedom is a rather tricky concept to make sense of. Applied to the independent samples t-test, degrees of freedom is the same as the total number of observations (i.e., individuals) in the two groups minus 2 (n-2), since one mean is estimated for each group.
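For the equal-variance test that `ttest` runs by default, the degrees of freedom can be checked against the group sizes that Stata stores after the command. The sketch below assumes that StataData1.dta (the dataset used in the practical example) is already in memory and uses its variables `cognitive` and `sex`.

```stata
* Assumption: StataData1.dta is already loaded
quietly ttest cognitive, by(sex)

* Group sizes and degrees of freedom stored by ttest
display "n (group 1)  = " r(N_1)
display "n (group 2)  = " r(N_2)
display "df reported  = " r(df_t)

* For the default equal-variance test: df = n1 + n2 - 2
display "n1 + n2 - 2  = " r(N_1) + r(N_2) - 2
```

The last two lines should show the same number. Observations with missing values on either variable are not counted.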
P-value
When performing a t-test, we want to examine whether there is sufficient evidence to reject the null hypothesis (which stipulates that there is no difference between the groups).
The further the t-statistic is from 0, the lower the two-sided p-value. A p-value that is lower than 0.05 means that we can reject the null hypothesis at the 5 % significance level.
In Stata, there are three p-values that are reported when we perform a t-test:
| Ha: diff != 0 | Two-sided t-test. |
| Ha: diff < 0 | One-sided t-test (left tail) |
| Ha: diff > 0 | One-sided t-test (right tail) |
| Note: Generally, we focus on the two-sided t-test since we make no assumption about the direction of the association/relationship, i.e., we do not specifically assume that Group A has either a lower or a higher mean value than Group B. For the one-sided (left tail) t-test, we assume that Group A has a lower mean value than Group B. For the one-sided (right tail) t-test, we assume that Group A has a higher mean value than Group B. The one-sided t-tests have more power to produce p-values below 0.05 in the assumed direction, but we also risk missing associations that go in the opposite direction. |
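To see how the three p-values relate to each other, they can be reproduced from a t-statistic and its degrees of freedom with Stata's `ttail()` function. This is a minimal sketch that plugs in the values from the practical example further down (t = 4.5949 with 8877 degrees of freedom). The scalar names `t_stat` and `df_t` are chosen here for illustration.

```stata
* Values taken from the practical example
scalar t_stat = 4.5949
scalar df_t   = 8877

* ttail(df, t) returns the upper-tail probability P(T > t)
display "Ha: diff != 0   p = " 2 * ttail(df_t, abs(t_stat))
display "Ha: diff < 0    p = " 1 - ttail(df_t, t_stat)
display "Ha: diff > 0    p = " ttail(df_t, t_stat)
```

The two-sided p-value is twice the smaller of the two one-sided p-values, which is why a one-sided test in the correct direction reaches 0.05 more easily.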
Assumptions
First, you have to check your data to see that the assumptions behind the independent samples t-test hold. If your data “passes” these assumptions, you will have a valid result.
Below is a checklist for these assumptions.
| Continuous test variable | Your test variable should be continuous (i.e. interval/ratio). For example: Income, height, weight, number of years of schooling, and so on. Although they are not really continuous, it is still very common to use ratings as continuous variables, such as: “How satisfied with your income are you?” (on a scale 1-10) or “To what extent do you agree with the previous statement?” (on a scale 1-5). |
| Normal distribution | The test variable should be approximately normally distributed. Use a histogram to check (see Histogram). |
| Two unrelated categories in the group variable | Your group variable should be categorical and consist of only two groups. Unrelated means that the two groups should be mutually exclusive: no individual can be in both groups. For example: men vs. women, employed vs. unemployed, low-income earner vs. high-income earner, and so on. |
| No outliers | An outlier is an extreme (low or high) value. For example, if most individuals have a test score between 40 and 60, but one individual has a score of 96 or another individual has a score of 1, this will distort the test. |
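The checklist can be worked through with a few standard commands. The sketch below is one possible way to do it, not a fixed procedure. It assumes that StataData1.dta from the practical example is in memory and uses its variables `cognitive` (test variable) and `sex` (group variable).

```stata
* Assumption: StataData1.dta is already loaded

* Group variable: confirm that it has exactly two categories
tabulate sex

* Normal distribution: histogram of the test variable, one panel per group
histogram cognitive, by(sex)

* Outliers: box plot per group; points beyond the whiskers are candidates
graph box cognitive, over(sex)

* Extreme values: the detailed summary lists the smallest and largest values
summarize cognitive, detail
```

Whether a given value counts as an outlier remains a judgement call; the plots only point out candidates to look at more closely.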
Functions
| Basic command |
| ttest testvar, by(groupvar) |
| Explanations | |
| testvar | Insert the name of the variable that you want to test. |
| groupvar | Insert the name of the variable that defines the two groups. |
| More information | help ttest |
Practical example
| Dataset |
| StataData1.dta |
| Variable name | cognitive |
| Variable label | Cognitive test score (Age 15, Year 1985) |
| Value labels | N/A |
| Variable name | sex |
| Variable label | Sex |
| Value labels | 0=Man 1=Woman |
| ttest cognitive, by(sex) |

To start with, the overall (combined) mean is 308.4708. Men have a slightly higher mean value than women (311.943 vs. 304.9106; a difference of 7.032464).
We can note that the t-statistic (t) in this example is 4.5949, with 8877 degrees of freedom.
The corresponding p-value is reported as 0.0000 (look below “Ha: diff != 0”), which means that it is smaller than 0.00005. This is below 0.05, which allows us to reject the null hypothesis (which postulates that there is no mean difference between the two groups).
In other words, there is a statistically significant difference in mean cognitive test scores between men and women in this example, to the advantage of men.
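The numbers discussed above are also stored by Stata after `ttest` and can be retrieved without reading them off the output table. The following sketch assumes that StataData1.dta is in memory. Group 1 is the category with the lowest value of `sex` (here 0=Man).

```stata
* Assumption: StataData1.dta is already loaded
quietly ttest cognitive, by(sex)

* Difference in means (group 1 minus group 2, here men minus women)
display "difference    = " r(mu_1) - r(mu_2)

* t-statistic = difference / standard error of the difference
display "t by hand     = " (r(mu_1) - r(mu_2)) / r(se)
display "t from ttest  = " r(t)

* Two-sided p-value with more digits than the output table shows
display "p (two-sided) = " %9.0g r(p)
```

Typing `return list` directly after `ttest` shows all stored results.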

