With the command linktest, we can assess whether our model is correctly specified. The test refits the model using only two predictors: the linear predicted value (called _hat) and the linear predicted value squared (_hatsq). If the model is correctly specified, we expect _hat to be statistically significant and _hatsq to be statistically non-significant. If one or both of these expectations are not met, this suggests that the model is mis-specified, for example because relevant variables are missing or because the included variables are entered in the wrong form.
However, do not rely too much on this test. Remember that you should also use theory and common sense to guide your decisions. The test is seldom relevant if our ambition is to investigate associations rather than to make the best possible prediction of the outcome.
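To make the logic of the test more concrete, the following is a minimal sketch of how it could be reproduced by hand. It assumes the variables from the practical example in this section (earlyret, bmi, sex, educ, pop_logistic); the names in_model, xb_hat and xb_hatsq are hypothetical and only used for illustration. The results should be close to what linktest reports, but this is a sketch of the idea and not necessarily how the command works internally.

```stata
* Fit the model quietly (same specification as in the practical example)
quietly logistic earlyret bmi sex ib1.educ if pop_logistic==1

* Flag the estimation sample so the auxiliary model uses the same observations
* (in_model is a hypothetical variable name)
generate byte in_model = e(sample)

* Linear predicted value (log odds) and its square
* (xb_hat and xb_hatsq are hypothetical names, corresponding to _hat and _hatsq)
predict double xb_hat if in_model, xb
generate double xb_hatsq = xb_hat^2

* Auxiliary model: the outcome regressed on the two constructed variables only
logit earlyret xb_hat xb_hatsq if in_model
```

In the output of the last command, the row for xb_hat plays the role of _hat and the row for xb_hatsq plays the role of _hatsq.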
Practical example
We perform this test for the full model, so let us go back to the example from the multiple regression analysis. The quietly option is included at the beginning of the command to suppress the regression output.
quietly logistic earlyret bmi sex ib1.educ if pop_logistic==1
And then we run the test:
linktest

Since the p-value for the variable _hat is above 0.05 and the p-value for _hatsq is below 0.05, neither of the two expectations is met, which suggests that our model is mis-specified. This is not surprising, given our problems with the multiple regression analysis earlier.
We could try to amend this by transforming one or more of the included variables (e.g. through categorisation or log transformation), excluding one or more of the included variables, or adding more variables to the model (other x-variables or e.g. interactions between the included variables).
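As an illustration of what such amendments could look like in practice, here is a minimal sketch with two alternative specifications, each followed by a new link test. Both are assumptions chosen only for illustration: the sketch assumes that bmi is continuous and strictly positive and that sex is coded as a categorical variable with non-negative integer values, and ln_bmi is a hypothetical variable name. Nothing here implies that either specification actually solves the problem in this data set.

```stata
* Alternative 1 (hypothetical): replace bmi with its natural logarithm
generate double ln_bmi = ln(bmi)
quietly logistic earlyret ln_bmi sex ib1.educ if pop_logistic==1
linktest

* Alternative 2 (hypothetical): add an interaction between bmi and sex
* c.bmi##i.sex includes both main effects and their interaction
quietly logistic earlyret c.bmi##i.sex ib1.educ if pop_logistic==1
linktest
```

After each linktest, we would again check whether _hat is statistically significant and _hatsq is statistically non-significant.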
Of course, this should be explored before we continue to assess model fit – but for the sake of simplicity, we will ignore this problem in the following sections.