Variation

Besides the mean, the median, and the mode, we may use some measures of variation to describe our variables further.

Here are some of the most common measures of variation:

Minimum: the lowest value
Maximum: the highest value
Variance: the average of squared deviations from the mean value
Standard deviation: the square root of the variance
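
As a quick illustration, the minimum and maximum can be read off directly in Python. The ages below are made-up example data, not taken from any real study.

```python
# Hypothetical data: ages (in years) of eight respondents
ages = [23, 35, 31, 47, 19, 52, 35, 28]

lowest = min(ages)    # the minimum: the lowest value
highest = max(ages)   # the maximum: the highest value

print(lowest)   # 19
print(highest)  # 52
```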

These measures are most suitable for continuous variables (i.e. ratio or interval), but the minimum and maximum are sometimes used for ordinal variables as well. However, none of them can be used for nominal variables (for the same reason that we do not use the mean or median to describe nominal variables: the categories have no inherent order).

The minimum and maximum are rather self-explanatory, but what about variance and standard deviation?

Below, these measures are discussed in more detail.

Variance and standard deviation

Both variance and standard deviation are measures used to describe the dispersion (spread) of data around the mean value of a variable.

To calculate the variance, we do the following:

1. Calculate the mean of the variable (the sum of all values, divided by the number of observations).
2. Subtract the mean from each value. These differences are often called deviations. Values below the mean will have negative deviations whereas values above the mean will have positive deviations.
3. Square each deviation so that negative and positive deviations do not cancel each other out.
4. Add the squared deviations together. Divide the sum by the number of observations.

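However, the variance is quite difficult to interpret, because it is expressed in squared units of the original variable. That is why most people prefer to express dispersion in terms of the standard deviation, which is in the same units as the variable itself. To do this, we just add one more step: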

5. Take the square root of the variance. 
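
The five steps above can be sketched in Python as follows. This is only a minimal illustration, assuming the data cover the entire population; the function names and the example values are made up for this sketch.

```python
import math

def population_variance(values):
    """Variance when the values cover the entire population."""
    n = len(values)
    mean = sum(values) / n                     # Step 1: the mean
    deviations = [x - mean for x in values]    # Step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]     # Step 3: squared deviations
    return sum(squared) / n                    # Step 4: sum, divided by n

def population_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))  # Step 5

# Hypothetical example data
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(values))  # 4.0
print(population_sd(values))        # 2.0
```

Here the mean is 5, the squared deviations sum to 32, and 32 divided by 8 observations gives a variance of 4 and a standard deviation of 2.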

Population vs sample

The above calculations are based on the idea that the data we use encompass the entire population that we want to study. This is seldom the case; more often, we have drawn a sample from our population.

Under such circumstances, we need to make a small adjustment to Step 4:

1. Calculate the mean of the variable (the sum of all values, divided by the number of observations).
2. Subtract the mean from each value. These differences are often called deviations. Values below the mean will have negative deviations whereas values above the mean will have positive deviations.
3. Square each deviation so that negative and positive deviations do not cancel each other out.
4. Add the squared deviations together. Divide the sum by the number of observations minus 1.
5. Take the square root of the variance. 
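
The adjusted steps for a sample can be sketched in the same way. Again, the function names and data are hypothetical; the only change from the population calculation is the divisor in Step 4.

```python
import math
import statistics

def sample_variance(values):
    """Variance when the values are a sample drawn from a larger population."""
    n = len(values)
    mean = sum(values) / n                        # Step 1
    squared = [(x - mean) ** 2 for x in values]   # Steps 2 and 3
    return sum(squared) / (n - 1)                 # Step 4: divide by n - 1

def sample_sd(values):
    return math.sqrt(sample_variance(values))     # Step 5

# Hypothetical sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_variance(sample))  # 32 / 7, roughly 4.57
print(sample_sd(sample))        # roughly 2.14

# Python's standard library makes the same distinction:
# statistics.variance and statistics.stdev divide by n - 1,
# while statistics.pvariance and statistics.pstdev divide by n.
print(statistics.variance(sample))
print(statistics.pvariance(sample))
```

Note that many software packages report the sample version (n - 1) by default, so it is worth checking which one a given function computes.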

We will not go into detail regarding why we adjust Step 4, as described above.

But basically, it has to do with the distinction between parameters and statistics: for populations, we can calculate parameters (fixed, “true” values), whereas for samples, we can calculate statistics (estimated values that depend on the selected sample). We use “observations minus 1” (usually expressed as “n-1”) to produce a less biased estimate.
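
A small simulation can give a feel for this. The sketch below assumes a made-up population (the integers 1 to 100), draws many small samples from it, and compares the average of the two variance calculations with the population variance. The population, sample size, and number of trials are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so that repeated runs give the same result

# Hypothetical population: the integers 1 to 100
population = list(range(1, 101))
true_variance = statistics.pvariance(population)  # parameter: divides by N

n = 5            # size of each sample
trials = 10000   # number of samples drawn

using_n = []
using_n_minus_1 = []
for _ in range(trials):
    sample = random.choices(population, k=n)            # draw with replacement
    using_n.append(statistics.pvariance(sample))        # divides by n
    using_n_minus_1.append(statistics.variance(sample)) # divides by n - 1

avg_n = statistics.fmean(using_n)
avg_n_minus_1 = statistics.fmean(using_n_minus_1)

print("population variance:", true_variance)
print("average estimate using n:", avg_n)
print("average estimate using n-1:", avg_n_minus_1)
```

Across many samples, the estimates that divide by n tend to fall below the population variance, while those that divide by n-1 average out much closer to it.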

Want to know more? Read up on Bessel’s correction.