A Simulation Study to Compare Inferential Properties when Modelling Ordinal Outcomes: The Case for the (Plain but Robust) Proportional Odds Model

Stefan Inerle, Markus Pauly, Moritz Berger

Abstract

Ordinal measurements are common outcomes in studies within psychology, as well as in the social and behavioral sciences. Choosing an appropriate regression model for analysing such data is a difficult task. This paper aims to facilitate modeling decisions for quantitative researchers by presenting the results of an extensive simulation study on the inferential properties of common ordinal regression models: the proportional odds model, the category-specific odds model, the location-shift model, the location-scale model, and the linear model, which incorrectly treats ordinal outcomes as metric. The simulations were conducted under different data-generating processes based on each of the ordinal models and varying parameter configurations within each model class. We examined the bias of parameter estimates as well as type I error rates ($\alpha$-errors) and the power of the statistical parameter tests corresponding to the respective models. Two findings stand out. For parameter estimates, we observed that cumulative ordinal regression models exhibited large biases in cases of large parameter values and high skewness of the outcome distribution in the true data-generating process. Regarding statistical hypothesis testing, the proportional odds model and the linear model showed the most reliable results. Due to its better fit and interpretability for ordinal outcomes, we recommend the use of the proportional odds model unless there are relevant contraindications.

Paper Structure (34 sections, 8 equations, 22 figures, 1 table)

Figures (22)

  • Figure 1: Probability distribution of the outcome variable for seven categories and no effect of the covariates when setting $\boldsymbol{\theta} = (-2.74, -1.96, -1.45, -1.00, -0.50, 0.29)^T$ (skewed setting).
  • Figure 2: Influence of simulation settings on the biases when data are generated from the proportional odds model, aggregated across all $n$, $k$, and all distribution settings. Except in their specific columns, boxplots were aggregated across all numbers of covariates, numbers of informative variables, and true location parameter values.
  • Figure 3: Influence of simulation settings on the dispersion biases when data are generated from the respective model including dispersion, aggregated across all $n$, numbers of covariates, and all distribution settings. Except in their specific columns, boxplots were aggregated across all numbers of categories $k$, numbers of informative variables, true location parameter values, and true dispersion parameter values.
  • Figure 4: Relationship between the biases of the location parameter and the dispersion parameter in the location-shift and the location-scale model.
  • Figure 5: $\alpha$-error of the models when data are generated from the proportional odds (PO) model. The chosen base model has $p = 5$ covariates, one informative covariate, $k = 3$ categories, and the uniform distribution setting.
  • ...and 17 more figures
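
The skewed setting in the Figure 1 caption can be made concrete with a small sketch. It assumes a cumulative logit parameterization, P(Y <= r) = logistic(theta_r + eta), where eta stands for the linear predictor and is zero when the covariates have no effect. The sign convention for eta, the function names, and the sample size of 1000 are illustrative assumptions and are not taken from the paper; only the threshold values come from the caption.

```python
import math
import random

# Thresholds from the Figure 1 caption (skewed setting, seven categories).
THETA = [-2.74, -1.96, -1.45, -1.00, -0.50, 0.29]


def logistic(z):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(theta, eta=0.0):
    """Category probabilities under an assumed cumulative logit model.

    Assumption: P(Y <= r) = logistic(theta_r + eta). With eta = 0
    (no covariate effect) the sign convention for eta is irrelevant.
    """
    cumulative = [logistic(t + eta) for t in theta] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # Each category probability is a difference of adjacent cumulative ones.
        probs.append(c - previous)
        previous = c
    return probs


probs = category_probabilities(THETA)

# Draw a hypothetical sample of ordinal outcomes coded 1..7.
rng = random.Random(1)
sample = rng.choices(range(1, len(probs) + 1), weights=probs, k=1000)

for category, p in enumerate(probs, start=1):
    print(f"category {category}: {p:.3f}")
```

Under these assumptions the probabilities increase from the lowest to the highest category, with most of the mass in the last one, which is what makes the setting skewed.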