A practical example

Function step 1

Basic command
  factor varlist

Useful options
  factor varlist, mineigen(number)
  factor varlist, pcf (or ipf, or ml)

Explanations
  varlist           List the variables that you want to include in the analysis.
  pcf, ipf or ml    Specify the estimation method. The default is pf.

Short names
  pf     Principal factor method
  pcf    Principal-component factor method
  ipf    Iterated principal-factor method
  ml     Maximum-likelihood factor method

Note
  Options can be combined, e.g.: factor varlist, mineigen(number) pcf

More information
  help factor

Performing a factor analysis can be seen as an iterative process: you conduct the analysis, evaluate it, perhaps tweak it a bit, and then conduct it again. We will start by performing a simple factor analysis with the principal-component factor method (pcf).

Practical example

Dataset
StataData2.dta
Variable name    Variable label
imp_ideas        Important to think up new ideas
imp_rich         Important to be rich
imp_secure       Important living in secure surroundings
imp_good         Important to have a good time
imp_help         Important to help people
imp_success      Important to be successful
imp_risk         Important with adventure and taking risks
imp_behave       Important to always behave properly
imp_environ      Important looking after the environment
imp_trad         Important with tradition

factor imp_ideas-imp_trad, pcf

In the first table, we start with the column called Eigenvalue. We see that Factor1 and Factor2 produce eigenvalues above 1 (2.98870 and 1.61967, respectively). Next, focusing on the column called Proportion, we see that Factor1 accounts for 30% (0.2989) and Factor2 for 16% (0.1620) of the variance.
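The link between the Eigenvalue and Proportion columns can be checked by hand. The sketch below assumes that StataData2.dta is in the current working directory. It relies on the fact that, with pcf applied to ten standardized items, the total variance equals the number of items (10), so each proportion is simply the eigenvalue divided by 10.

```stata
* Assumption: StataData2.dta is located in the current working directory
use StataData2.dta, clear

* Principal-component factor analysis of the ten items
factor imp_ideas-imp_trad, pcf

* Proportion = eigenvalue / number of items (eigenvalues taken from the output above)
display 2.98870 / 10
display 1.61967 / 10
```

The two displayed values should correspond to the proportions reported for Factor1 and Factor2 in the first table.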

In the second table, we get the factor loadings for each item. With the option pcf, factor loadings are only shown for factors with eigenvalues above 1. For Factor1, loadings range between 0.4091 and 0.6658. For Factor2, they range between -0.3716 and 0.5807. The uniqueness values range between 0.4809 and 0.6089. Earlier, we suggested that factor loadings between 0.5 and 1 were acceptable, as well as uniqueness values between 0 and 0.5. Since some loadings fall below 0.5 and some uniqueness values exceed 0.5, our factor solution is quite poor. Moreover, it is not entirely clear which item belongs to which factor: we might need some rotation here.
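To make the evaluation of loadings and uniqueness values easier, the output can be trimmed and the stored results inspected. This is only a sketch: the cut-off of 0.5 mirrors the rule of thumb mentioned above, the matrix names L, C and U are arbitrary, and it assumes that factor stores the loadings in e(L) and the uniqueness values in e(Psi), as described in help factor.

```stata
* Rerun the analysis, blanking out loadings with an absolute value below 0.5
factor imp_ideas-imp_trad, pcf blanks(0.5)

* Communality of each item = sum of its squared loadings on the retained factors
matrix L = e(L)
matrix C = vecdiag(L * L')

* Uniqueness = 1 - communality
matrix U = J(1, colsof(C), 1) - C
matrix list U

* Compare with the uniqueness values stored by factor
matrix list e(Psi)
```

Items that show only blanks in the loading table, or a uniqueness value well above 0.5, are the ones to look at more closely.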

Function step 2

Basic command
  rotate

Useful options
  rotate, quartimax
  rotate, equamax
  rotate, promax(number)
  rotate, oblimin(number) oblique

Explanations
  quartimax          Orthogonal rotation with the quartimax option.
  equamax            Orthogonal rotation with the equamax option.
  promax(number)     Oblique rotation with the promax option; replace “number” with the preferred power (default is 3).
  oblimin(number)    Rotation with the oblimin option; replace “number” with the preferred gamma (default is 0). Add the option oblique to obtain an oblique rotation, since rotate restricts itself to orthogonal rotations by default (except with promax).

Note
  Orthogonal rotation with the varimax option is the default. To clear the results from rotation, use: rotate, clear

More information
  help rotate

The next step is to rotate the results to minimize the complexity of the factor structure and facilitate interpretation. Since it is unlikely that our factors are uncorrelated (they seldom are, in the social sciences), we will go with an oblique rotation (more specifically, we try out promax).

Practical example

Dataset
StataData2.dta

rotate, promax

The rotation made the factor loadings more clearly reflect the two factors.
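Because promax allows the factors to correlate, it can be useful to look at how strongly they actually do. The following sketch assumes that the factor command from step 1 has just been run; the cut-off of 0.4 used for blanking small loadings is an arbitrary choice for illustration.

```stata
* Oblique promax rotation with the default power of 3,
* hiding loadings with an absolute value below 0.4
rotate, promax(3) blanks(0.4)

* Correlation matrix of the rotated common factors
estat common

* Remove the rotation results again if another rotation should be tried
rotate, clear
```

If the factor correlation turns out to be close to zero, an orthogonal rotation such as varimax would probably lead to a very similar interpretation.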

If we identify for which factor each item has the higher loading, we can conclude that the two factors contain the following items:

Factor 1

  • Important living in secure surroundings (security)
  • Important to help people (benevolence)
  • Important to always behave properly (conformity)
  • Important looking after the environment (universalism)
  • Important with tradition (tradition)

Factor 2

  • Important to think up new ideas (self-direction)
  • Important to be rich (power)
  • Important to have a good time (hedonism)
  • Important to be successful (achievement)
  • Important with adventure and taking risks (stimulation)

The ten variables used in this factor analysis actually stem from a theory of human values, developed by Schwartz. According to this theory, the variables should be categorised in the following way:

  • Conservation: security, tradition, and conformity
  • Openness to change: self-direction, stimulation, and hedonism
  • Self-enhancement: power and achievement
  • Self-transcendence: benevolence and universalism

If we compare the theoretical categories with the factors derived from the factor analysis, we see that Factor 1 includes all variables theoretically associated with conservation and self-transcendence, whereas Factor 2 includes all variables theoretically associated with openness to change and self-enhancement.

What do we do with this information then? Well, we need to examine possible reasons as to why the factor analysis did not reveal the same factors as the theory proposes. If we find no apparent problems with the empirics (e.g. missing data, problems with the questionnaire itself, etc.) we may suggest that the theory needs to be modified. At least it is important to discuss the differences between the theory and the empirics.

Sometimes, we do not have a clear theory guiding the factor analysis and, thus, we have no a priori understanding of which factors are reasonable to expect. In that case, it is common practice to focus on a factor solution with good properties (i.e. clear factor structure and high factor loadings). There is always a trade-off between theory and empirics: if theory has precedence over empirics, we may be more disposed to accept lower factor loadings.

In practice, all of this might mean that we go on to create two indices (e.g. sum score, or mean score), with each reflecting one factor, which we can then include in another analysis (such as regression analysis). 
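A minimal sketch of how such indices could be created is shown below. The item groupings follow the two factors listed above, whereas the index names (cons_trans, open_enh, cons_trans_sum) are hypothetical and chosen only for illustration.

```stata
* Mean-score indices; rowmean() averages over the non-missing items of each respondent
egen cons_trans = rowmean(imp_secure imp_help imp_behave imp_environ imp_trad)
egen open_enh   = rowmean(imp_ideas imp_rich imp_good imp_success imp_risk)

* Sum-score alternative; note that rowtotal() treats missing items as 0
egen cons_trans_sum = rowtotal(imp_secure imp_help imp_behave imp_environ imp_trad)

* Inspect the new indices before using them in further analyses
summarize cons_trans open_enh cons_trans_sum
```

Mean scores keep the index on the same scale as the original items, which often makes them easier to interpret than sum scores when some respondents have missing values.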

Function step 3

Basic command
  estat kmo
  screeplot

Explanations
  estat kmo    Kaiser-Meyer-Olkin measure of sampling adequacy.
  screeplot    Plot the eigenvalues.

More information
  help estat factor

The third step is to do some postestimations, such as looking at the Kaiser-Meyer-Olkin measure of sampling adequacy and a screeplot, to see if our two-factor solution makes sense. 

Note that if we find any problems here with our factor analysis or the chosen number of factors, we should go back and make some adjustments in order to find a better solution. For instance, we can try out different estimation methods, rotate the solution differently, or remove one or several of the items.
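The adjustments mentioned above could look roughly as follows. This is a sketch rather than a recommendation: the choice of ipf, the oblimin rotation and, in particular, the item that is left out (imp_trad) are assumptions made purely for illustration.

```stata
* Alternative estimation method: iterated principal factors, keeping two factors
factor imp_ideas-imp_trad, ipf factors(2)

* Alternative oblique rotation: oblimin with gamma = 0
rotate, oblimin(0) oblique

* Rerun the original analysis without one item (here, hypothetically, imp_trad)
factor imp_ideas imp_rich imp_secure imp_good imp_help imp_success imp_risk imp_behave imp_environ, pcf
rotate, promax
```

Each variant should then be evaluated in the same way as before, by looking at the eigenvalues, factor loadings and uniqueness values.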

Practical example

Dataset
StataData2.dta

estat kmo

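The KMO test produces an overall value of 0.7918. Values closer to 1 indicate better sampling adequacy, and values below 0.5 are usually regarded as unacceptable, so our data appear to be suitable for factor analysis.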

screeplot

In the screeplot, we can see that the “elbow” begins with the third factor, thus reflecting that a two-factor solution seems feasible.
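The two postestimation checks can be run together right after the factor command. The sketch below adds two optional extras that are not part of the example above: the novar option, which limits the KMO output to the overall value, and a horizontal reference line at an eigenvalue of 1, which makes it easier to relate the plot to the eigenvalue-above-1 criterion.

```stata
* Rerun the factor analysis so that the postestimation commands have results to work with
factor imp_ideas-imp_trad, pcf

* KMO for each item as well as overall
estat kmo

* Overall KMO only
estat kmo, novar

* Scree plot with a reference line at eigenvalue = 1
screeplot, yline(1)
```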