Factor variables

The approach presented above is quite pedagogical (we think, at least), but generating dummies by hand is time-consuming. In Stata, there is an easy fix: something called factor variables. When you run your regression analysis, you simply write the prefix “i.” before the name of the categorical variable(s), e.g. “i.educ”, and Stata will include the dummies in the analysis automatically.
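As a minimal sketch of the “i.” prefix, the example below uses Stata's built-in nlsw88 dataset rather than the data used in this guide, so the variables wage and race are stand-ins chosen for illustration only.

```stata
* Illustration only: built-in example data, not the data used in this guide
sysuse nlsw88, clear

* race is a categorical variable with three values (1, 2, 3)
tabulate race

* i.race makes Stata create the dummies on the fly;
* the lowest value (1) is left out as the reference category
regress wage i.race
```

The output shows one coefficient for each category except the reference category, and no dummy variables are added to the dataset.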

Base level

When you include factor variables, the lowest value is automatically chosen as the reference category. This can be altered by specifying another so-called base level. This is done by adding a “b” to the prefix: “ib.” You then also need to specify which category should be the reference category by adding the value of that category after the “b”, e.g. “ib3.educ” (which would define “University” as the reference category).

There are also some alternatives to specifying the value of the category. These are the possible so-called base operators:

ib#.  Specifies a specific value as the base. # = the value of the category that we want to choose as the reference category.
ib(##).  Specifies the #th ordered value as the base.
ib(first).  Specifies the smallest value as the base. Default.
ib(last).  Specifies the largest value as the base.
ib(freq).  Specifies the most common value as the base.
ibn.  No base level.
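The base operators can be tried out as in the sketch below. It again assumes Stata's built-in nlsw88 dataset, with wage and race as stand-in variables, so the category values are not the same as those of educ in this guide.

```stata
* Illustration only: built-in example data, not the data used in this guide
sysuse nlsw88, clear

* The value 3 as the reference category
regress wage ib3.race

* The second ordered value as the reference category
regress wage ib(#2).race

* The largest value as the reference category
regress wage ib(last).race

* The most common value as the reference category
regress wage ib(freq).race

* No base level: all dummies are included, so the constant is
* dropped here to avoid perfect collinearity
regress wage ibn.race, noconstant
```

Note that “ib3.” refers to the category with the value 3, whereas “ib(#3).” refers to the third category in sorted order. The two only coincide when the categories are numbered 1, 2, 3, and so on.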