A GUIDE TO APPLIED STATISTICS WITH STATA

A comparison between study designs

Written by:

Ylva B Almquist

[Figure: a comparison between study designs; image not reproduced here. Notes a to d below refer to the figure.]

a See X, y, and z for a discussion about exposures and outcomes.
b See Hypothesis testing for information about hypothesis testing.
c See Z: confounding, mediating and moderating variables for a discussion about confounding.
d See A note on causal inference for a discussion about causality.
