Introduction

We talk a lot about variables in this guide, because variables are the cornerstones of quantitative data materials and quantitative data analysis. Other terms are sometimes used instead of “variables” – such as “indicators”, “measures” or “items”.

A variable is meant to capture the concept that we are interested in. The process of “translating” a concept into a variable is often called operationalisation. Some concepts are rather vague and not particularly easy to operationalise. One such example is “health”: should it be assessed by administrative health records or self-reported information? Is it simply the absence of disease or something more than that? Concepts such as “income” are more concrete since they refer to units (money) that can quite easily be measured. Still, income can be operationalised in many ways: monthly or annual income; income before or after taxes; individual income or household income; and so on. Operationalisation should always be carefully reflected on and clearly motivated in research, since it might have important consequences for the analysis and therefore for the interpretation of the results.
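
As a rough illustration of how much the choice of operationalisation can matter, the sketch below derives four different “income” variables from the same made-up household. The household, the figures, the flat tax percentages and all names in the code are assumptions invented for this example; they do not come from any real dataset or tax system.

```python
# Hypothetical household with two earners (all figures are made up).
household = [
    {"person": "A", "monthly_gross": 3000, "tax_percent": 30},
    {"person": "B", "monthly_gross": 2000, "tax_percent": 25},
]


def monthly_net(person):
    """Monthly income after a (simplified, flat) tax percentage."""
    return person["monthly_gross"] * (100 - person["tax_percent"]) / 100


person_a = household[0]

# Four possible operationalisations of "income" for the same respondent:
individual_monthly_gross = person_a["monthly_gross"]                  # individual, monthly, before tax
individual_annual_net = 12 * monthly_net(person_a)                    # individual, annual, after tax
household_monthly_gross = sum(p["monthly_gross"] for p in household)  # household, monthly, before tax
household_annual_net = 12 * sum(monthly_net(p) for p in household)    # household, annual, after tax

print(individual_monthly_gross, individual_annual_net,
      household_monthly_gross, household_annual_net)
```

The same respondent ends up with four quite different values, which is why the chosen operationalisation needs to be stated and motivated.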

Associations

In many types of analysis – such as regression analysis – we are interested in the association between two (or more) variables. The term association (or relationship) reflects the hypothesis that the variables are linked to one another in some way.
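
One common way of quantifying an association between two continuous variables is the Pearson correlation coefficient. The sketch below computes it by hand on a small made-up dataset; the variable names and values are invented for illustration, and this is only one of several possible measures of association.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up illustration data (hypothetical, not from any real study)
years_of_education = [9, 11, 12, 12, 14, 15, 16, 18]
monthly_income = [1900, 2100, 2300, 2200, 2800, 3000, 3100, 3900]

r = pearson_r(years_of_education, monthly_income)
print(round(r, 2))  # a value between -1 and 1; positive here
```

A coefficient close to 1 or -1 indicates a strong linear association, while a value close to 0 indicates a weak one. The coefficient says nothing about which variable, if any, influences the other.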

Effects

The way that regression analysis is constructed, however, assumes that one variable has an “effect” on another variable. Here, we are talking about a statistical effect, not a causal effect. In other words, while we may find that one of the variables has a statistical effect on the other, this does not mean that we have proved that the first variable causes the second. A phrase commonly used in statistics to reflect this is: “correlation does not imply causation”. Note that while it is more correct to talk about statistical effects, it is not uncommon for there to be implicit or explicit ideas about causal effects, guided by previous studies and theories. Sometimes such assumptions are quite reasonable, but the extent to which we can make causal inferences with confidence depends on the study design.

We will come back to the issue of causality later in this chapter (see A note on causal inference).
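
The difference between a statistical effect and a causal effect can be illustrated with a minimal sketch of simple linear regression. The data and variable names below are made up for this example. The point is that the regression produces a slope (a statistical effect) whichever variable we place in the role of the outcome, so the direction of the “effect” is something the analyst assumes, not something the calculation proves.

```python
def ols_slope(x, y):
    """Least-squares slope b in the simple linear regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Made-up illustration data (hypothetical, not from any real study)
exercise_hours = [0, 1, 2, 3, 4, 5]
health_score = [52, 55, 61, 60, 66, 70]

# "Effect" of exercise on health: health_score is treated as the outcome
effect_of_exercise_on_health = ols_slope(exercise_hours, health_score)

# Swapping the roles also yields a slope: exercise_hours is now the outcome
effect_of_health_on_exercise = ols_slope(health_score, exercise_hours)

print(effect_of_exercise_on_health, effect_of_health_on_exercise)
```

Both slopes are positive, and nothing in the arithmetic tells us whether exercise improves health, whether healthier people exercise more, or whether a third variable drives both.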

X, y, and z

Variables play different roles in analysis. Researchers often use various terms to distinguish between these roles. Here, we will try to shed some light on the terms that are used.

Variables

x    Independent variable; Exposure; Predictor
y    Dependent variable; Outcome
z    Covariate; Confounder; Mediator; Moderator; Effect modifier