Calculating confidence intervals for descriptive statistics

By Ylva B Almquist

Now that we have discussed what confidence intervals are (and what they are not), we thought it would be a good time to show how to calculate them for descriptive statistics.

For this purpose, we can use the commands ci, centile, and proportion.


5.5.1 Confidence intervals for means

Can be used for continuous variables with a normal distribution.

Function

Basic command
ci means varlist
Useful options
ci means varlist, level(#)
Explanations
varlist
Insert the name(s) of the variable(s) that you want to use
level(#)
Specify the confidence level. Default is 95.
More information
help ci

Practical example

Dataset
StataData1.dta
Variable name | Variable label | Values and labels
gpa | Grade point average (Age 15, Year 1985) |

ci means gpa

In this example, we can see that the mean value for gpa is 3.18. The 95% confidence interval is 3.16-3.19.
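
To see where these numbers come from, the interval can be reproduced from the results that ci means leaves behind. The following is a minimal sketch, assuming that StataData1.dta is in the working directory and that ci means stores r(mean), r(se), and r(N) as described in help ci. The 99% level is only an illustration of the level(#) option.

```stata
* Load the example dataset (assumed to be in the working directory)
use StataData1.dta, clear

* Request a 99% instead of the default 95% confidence interval
ci means gpa, level(99)

* Reproduce the default 95% interval by hand:
* mean +/- (critical t value) * (standard error)
quietly ci means gpa
display "Lower: " r(mean) - invttail(r(N) - 1, 0.025) * r(se)
display "Upper: " r(mean) + invttail(r(N) - 1, 0.025) * r(se)
```

The two displayed values should match the bounds that ci means gpa reports. A higher confidence level gives a wider interval.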


5.5.2 Confidence intervals for median

Can be used for continuous variables (with a normal or skewed distribution).

Function

Basic command
centile varlist
Useful options
centile varlist, level(#)
Explanations
varlist
Insert the name(s) of the variable(s) that you want to use
level(#)
Specify the confidence level. Default is 95.
More information
help centile

Practical example

Dataset
StataData1.dta
Variable name | Variable label | Values and labels
cognitive | Cognitive test score (Age 15, 1985) |

centile cognitive

In this example, we can see that the median cognitive test score is 312, and the 95% confidence interval is 312-316.
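
By default, centile reports the 50th centile (the median), but it can also give confidence intervals for other centiles. The sketch below assumes the same dataset is loaded. The chosen centiles (25, 50, 75) and the 99% level are only illustrations.

```stata
* Load the example dataset (assumed to be in the working directory)
use StataData1.dta, clear

* Median with a 99% instead of the default 95% confidence interval
centile cognitive, level(99)

* Lower quartile, median, and upper quartile, each with a 95% confidence interval
centile cognitive, centile(25 50 75)
```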


5.5.3 Confidence intervals for variances and standard deviations

Can be used for variables that are continuous.

Function

Basic command
ci variances varlist
Useful options
ci variances varlist, level(#)
ci variances varlist, sd level(#)
Explanations
varlist
Insert the name(s) of the variable(s) that you want to use
sd
Option to display confidence interval for standard deviation.
level(#)
Specify the confidence level. Default is 95.
More information
help ci

Practical example

Dataset
StataData1.dta
Variable name | Variable label | Values and labels
cognitive | Cognitive test score (Age 15, 1985) |

ci variances cognitive

Here, the variance (5210) and its 95% confidence interval (5061-5367) are shown.

ci variances cognitive, sd

This shows the standard deviation (72.18) and its 95% confidence interval (71.14-73.26).
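
The options can be combined, and the default interval relies on the variable being normally distributed. As a hedged sketch, assuming the same dataset is loaded: the first command below changes the confidence level, and the second uses the bonett option, which help ci describes as an alternative interval that is less dependent on normality. Whether it is the better choice for cognitive is not something this example settles.

```stata
* Load the example dataset (assumed to be in the working directory)
use StataData1.dta, clear

* Standard deviation with a 99% confidence interval
ci variances cognitive, sd level(99)

* Standard deviation with a Bonett confidence interval (see help ci)
ci variances cognitive, sd bonett
```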


5.5.4 Confidence intervals for counts

Can be used for variables that are counts (non-negative whole numbers).

Function

Basic command
ci means varlist, poisson
Useful options
ci means varlist, poisson level(#)
Explanations
varlist
Insert the name(s) of the variable(s) that you want to use
level(#)
Specify the confidence level. Default is 95.
More information
help ci

Practical example

Dataset
StataData1.dta
Variable name | Variable label | Values and labels
unemp_42 | Days in unemployment (Age 42, Year 2012) |

ci means unemp_42, poisson

In this example, the mean is 17.53 days in unemployment. The 95% confidence interval is 17.45-17.61 days.
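
To get a feeling for what the poisson option changes, we can run the command with and without it. This is only a sketch, assuming the same dataset is loaded. It does not tell us which interval is more appropriate for unemp_42, only how much the two assumptions differ.

```stata
* Load the example dataset (assumed to be in the working directory)
use StataData1.dta, clear

* Interval based on the Poisson distribution (as in the example above)
ci means unemp_42, poisson

* Interval based on the normal distribution, for comparison
ci means unemp_42

* Poisson-based interval with a 99% confidence level
ci means unemp_42, poisson level(99)
```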


5.5.5 Confidence intervals for proportions

Can be used for categorical variables.

Function

Basic command
proportion varlist
Useful options
proportion varlist, level(#)
Explanations
varlist
Insert the name(s) of the variable(s) that you want to use
level(#)
Specify the confidence level. Default is 95.
More information
help proportion

Practical example

Dataset
StataData1.dta
Variable name | Variable label | Values and labels
educ | Educational level (Age 40, Year 2010) |

proportion educ

Here, we get the proportions (which can be translated into percentages) and their confidence intervals for the three categories of educ.

In this example, 19.2% have compulsory education (95% CI: 18.4-20.0%), 44.2% have upper secondary education (95% CI: 43.2-45.3%), and 36.6% have university education (95% CI: 35.6-37.6%).
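
The translation into percentages does not have to be done by hand. The sketch below assumes the same dataset is loaded. The percent option is, to our knowledge, only available in more recent versions of Stata, so check help proportion if the command returns an error.

```stata
* Load the example dataset (assumed to be in the working directory)
use StataData1.dta, clear

* Proportions with 99% instead of the default 95% confidence intervals
proportion educ, level(99)

* Report percentages instead of proportions (assumes a recent Stata version)
proportion educ, percent
```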