Scatterplot

Written by:

Ylva B Almquist

Quick facts

Number of variables
Two

Scales of variable(s)
Continuous

When we had two categorical variables, we could produce a crosstable to see how these two variables were related. If we have two continuous variables, we may use something called a scatterplot instead. Each dot in the scatterplot represents one individual in our data. We may also include a reference line here, to see if we have a pattern in our data (this will be discussed later).

The scatterplot can thus be used to illustrate how two continuous variables co-vary – or “correlate” – in their pattern of values. If increasing values of one variable correspond to increasing values of another variable, it is called a positive correlation. If increasing values of one variable correspond to decreasing values of another variable, we have a negative correlation. In the graph below, different types of correlation are presented. The letter “x” stands for x-axis (horizontal axis) and the letter “y” stands for y-axis (vertical axis).

Note
While not addressed here, patterns can of course also be non-linear (in contrast to the linear positive and negative correlations described above).

Function

Basic command

graph twoway scatter yvar xvar

Useful options

graph twoway (scatter yvar xvar) (lfit yvar xvar)
graph twoway (scatter yvar xvar) (lfitci yvar xvar)

Explanations
`yvar`	Insert the name of the first variable you want to use. This variable will be plotted on the y-axis (vertical axis).
`xvar`	Insert the name of the second variable you want to use. This variable will be plotted on the x-axis (horizontal axis).
`lfit`	Overlay the fitted line from a linear regression of yvar on xvar.
`lfitci`	Overlay the fitted linear regression line together with its confidence interval (95% by default).
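
To try these commands without your own data, the following minimal sketch may help. It assumes the auto example dataset that ships with Stata, which is not part of this guide's data; the variables mpg and weight belong to that dataset.

```stata
* Load the example dataset shipped with Stata (an assumption for illustration)
sysuse auto, clear

* Scatterplot of mileage (y-axis) against weight (x-axis) with a fitted line.
* Heavier cars tend to have lower mileage, so the line slopes downward.
graph twoway (scatter mpg weight) (lfit mpg weight)
```

Here the fitted line slopes downward, which is what a negative correlation looks like.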

More information
help scatter

Practical example

Dataset

StataData1.dta

Variable name	gpa
Variable label	Grade point average (Age 15, Year 1985)
Value labels	N/A

Variable name	cognitive
Variable label	Cognitive test score (Age 15, Year 1985)
Value labels	N/A

graph twoway (scatter gpa cognitive) (lfitci gpa cognitive)

In the scatterplot above, we display gpa on the y-axis (vertical axis) and cognitive on the x-axis (horizontal axis). We can see a fairly clear positive correlation here: the higher the cognitive test score, the higher the grade point average. The upward slope of the fitted regression line illustrates the same pattern, and the band around the line shows its confidence interval.
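
A possible variation on the command above is sketched below. It assumes that StataData1.dta is saved in the current working directory, and the axis titles are only suggestions based on the variable labels. Listing lfitci before scatter is a common way to keep the confidence band from covering the dots.

```stata
* Assumes StataData1.dta is in the current working directory
use StataData1.dta, clear

* Draw the confidence band and fitted line first, then the dots on top.
* The /// line continuation works in a do-file; in the Command window,
* type the whole command on one line instead.
graph twoway (lfitci gpa cognitive) (scatter gpa cognitive), ///
    ytitle("Grade point average") ///
    xtitle("Cognitive test score")
```

The relationship shown is the same as before; only the drawing order and the axis titles differ.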

Note
You can use the Graph Editor (see Graphs) to further edit the scatterplot.