Next we produce model diagnostics based on the multiple linear regression analysis with bullied, gradeyear, sex, bornoutswe, famtype, numrelocations, and healthissues.
Note Remember that the analyses should only be based on those with a value of 1 for the pop variable.
Note Remember to use factor variables when including categorical/non-binary variables.
Here, we should examine the following:
Model specification using a link test.
Normality using a density plot, a probability plot, and a quantile plot.
Multicollinearity using the variance inflation factor and a correlation matrix.
Before performing each test for model diagnostics, we need to specify which model the diagnostics should be performed for. Therefore, we need to run the analysis of our full model before the model diagnostics. We use the “quietly” prefix in our command to suppress the output.
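A minimal sketch of this step could look as follows. The name outcome is a placeholder, since the dependent variable is not named in this section, and treating famtype as the only non-binary categorical variable (hence i.famtype) is an assumption. Adjust both to your own data.

```stata
* Fit the full model silently, restricted to the study population (pop==1).
* "outcome" is a placeholder for the dependent variable (not named here).
* i.famtype assumes famtype is categorical with more than two levels.
quietly regress outcome bullied gradeyear sex bornoutswe i.famtype ///
    numrelocations healthissues if pop==1

* Link test for model specification, using the same sample restriction
linktest if pop==1
```

The link test refits the outcome on the prediction (_hat) and the squared prediction (_hatsq), which is why these two terms appear in its output.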
Here we can see that _hat is statistically significant and _hatsq is not. This looks good and suggests that our model is not incorrectly specified. It should also be noted that our research question concerns associations, not exact predictions, which means that it would not necessarily have been a major problem if the model had not passed the link test.
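For the normality checks listed above, one possible sequence of commands is sketched here. As before, outcome is a placeholder and i.famtype is an assumption. The variable name resid is also just an illustrative choice for the stored residuals.

```stata
* Re-run the full model silently so the diagnostics refer to it
quietly regress outcome bullied gradeyear sex bornoutswe i.famtype ///
    numrelocations healthissues if pop==1

* Store the residuals for the observations used in the model
predict resid if e(sample), residuals

* Density plot of the residuals with a normal curve overlaid
kdensity resid, normal

* Standardized normal probability plot
pnorm resid

* Quantiles of the residuals against quantiles of the normal distribution
qnorm resid
```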
In the density plot, we can see that the distribution of the residuals follows the normal curve relatively well. The probability plot shows a little deviation but is still OK, and the quantile plot shows a hint of an S-shape, but not enough to deem it problematic.
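For the multicollinearity check, the variance inflation factors can be requested directly after the model. The lines below are a sketch under the same assumptions as before (outcome is a placeholder, and famtype is assumed to be the non-binary categorical variable).

```stata
* Re-run the full model silently so the diagnostics refer to it
quietly regress outcome bullied gradeyear sex bornoutswe i.famtype ///
    numrelocations healthissues if pop==1

* Variance inflation factor for each term, followed by the mean VIF
estat vif
```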
We want to see a mean VIF below 10 and low individual VIF values. That is the case here. Then we create a correlation matrix.
estat vce, corr
We do not want to see any high correlations; preferably, all of them should be below 0.7. This looks good.
Note We do not examine linearity with a residual plot here, as all our x-variables have discrete values.
Summary Based on model diagnostics, we do not see any potential issues with the assumptions behind linear regression analysis.
In your thesis, the results from your model diagnostics may look different, and you might see potential problems. What to do about this depends a lot on your data and variables, and we suggest that you discuss with your supervisor what actions should be taken regarding your analyses.
Note For extra review, please re-visit the section on Model diagnostics.
Note When all analyses are completed, remember to save the dataset and do-file!
After performing all of the analyses and model diagnostics, it is time to decide what to include in the results section of our thesis. Let us look at this further in the next section on results.