Developing Fairness-Aware Task Decomposition to Improve Equity in Post-Spinal Fusion Complication Prediction

Yining Yuan; J. Ben Tamo; Wenqi Shi; Yishan Zhong; Micky C. Nnamdi; B. Randall Brenn; Steven W. Hwang; May D. Wang

TL;DR

This paper addresses fairness in postoperative complication prediction for spinal fusion by introducing FAIR-MTL, a fairness-aware multitask learning framework that discovers latent patient subgroups through an unsupervised demographic embedding and routes predictions through subgroup-specific heads. The end-to-end approach combines inverse-frequency weighting with regularization to mitigate subgroup disparities while preserving predictive performance, and it is validated on large clinical datasets and externally on INSPIRE. SHAP and feature-importance analyses make the model's outputs interpretable, and the results indicate that subgroup-aware learning reduces demographic parity and equalized-odds gaps without sacrificing accuracy. External validation and ablation studies support the contribution of each architectural component. Overall, FAIR-MTL offers a practical path toward fairer, more transparent surgical risk prediction in spine surgery, with potential for broader clinical deployment.
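The two fairness gaps mentioned above can be made concrete with a small sketch. It assumes binary predictions (for instance, one severity level taken one-vs-rest) and a common convention in which each gap is the largest between-group difference in a rate. The function names and toy data are illustrative, and the paper's exact metric definitions for its four-level outcome may differ.

```python
from collections import defaultdict


def positive_rate(preds):
    """Fraction of predictions equal to 1 (0.0 for an empty list)."""
    return sum(preds) / len(preds) if preds else 0.0


def max_rate_gap(pairs):
    """Largest gap in positive-prediction rate across groups.

    `pairs` is an iterable of (prediction, group) tuples.
    """
    by_group = defaultdict(list)
    for pred, group in pairs:
        by_group[group].append(pred)
    rates = [positive_rate(v) for v in by_group.values()]
    return max(rates) - min(rates) if rates else 0.0


def demographic_parity_difference(y_pred, groups):
    """Gap in the rate of positive predictions between groups."""
    return max_rate_gap(zip(y_pred, groups))


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in TPR and FPR."""
    gaps = []
    for label in (1, 0):  # label 1 gives the TPR gap, label 0 the FPR gap
        subset = [(p, g) for t, p, g in zip(y_true, y_pred, groups) if t == label]
        gaps.append(max_rate_gap(subset))
    return max(gaps)


# Toy data, invented for illustration only (not from the paper).
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

dpd = demographic_parity_difference(y_pred, groups)      # 0.75 - 0.25 = 0.5
eod = equalized_odds_difference(y_true, y_pred, groups)  # TPR gap 0.5, FPR gap 0.5
print(dpd, eod)
```

A value of 0 for either gap means the groups are treated identically under that criterion, so smaller values indicate a fairer model.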

Abstract

Fairness in clinical prediction models remains a persistent challenge, particularly in high-stakes applications such as spinal fusion surgery for scoliosis, where patient outcomes exhibit substantial heterogeneity. Many existing fairness approaches rely on coarse demographic adjustments or post-hoc corrections, which fail to capture the latent structure of clinical populations and may unintentionally reinforce bias. We propose FAIR-MTL, a fairness-aware multitask learning framework designed to provide equitable and fine-grained prediction of postoperative complication severity. Instead of relying on explicit sensitive attributes during model training, FAIR-MTL employs a data-driven subgroup inference mechanism. We extract a compact demographic embedding and apply k-means clustering to uncover latent patient subgroups that may be differentially affected by traditional models. These inferred subgroup labels determine task routing within a shared multitask architecture. During training, subgroup imbalance is mitigated through inverse-frequency weighting, and regularization prevents overfitting to smaller groups. Applied to postoperative complication prediction with four severity levels, FAIR-MTL achieves an AUC of 0.86 and an accuracy of 75%, outperforming single-task baselines while substantially reducing bias. For gender, the demographic parity difference decreases to 0.055 and the equalized odds difference to 0.094; for age, these values fall to 0.056 and 0.148, respectively. Model interpretability is supported by SHAP and Gini importance analyses, which consistently highlight clinically meaningful predictors such as hemoglobin, hematocrit, and patient weight. Our findings show that incorporating unsupervised subgroup discovery into a multitask framework enables more equitable, interpretable, and clinically actionable predictions for surgical risk stratification.
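The subgroup-inference and weighting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the embedding is computed, how many clusters are used, or the exact weight formula. The sketch assumes a toy 2-D embedding, plain Lloyd's k-means with k = 2, and the common "balanced" weight N / (K * n_k); all names and values are hypothetical.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def inverse_frequency_weights(labels):
    """Weight each subgroup by N / (K * n_k), so smaller groups count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {group: n / (k * count) for group, count in counts.items()}


# Toy "demographic embedding" (invented values): six patients in one
# region of the embedding space and two in another.
embedding = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.2, 0.2),
    (5.0, 5.0), (5.2, 5.2),
]

subgroups = kmeans(embedding, k=2)             # inferred subgroup per patient
weights = inverse_frequency_weights(subgroups)
sample_weights = [weights[g] for g in subgroups]  # per-patient loss weights
print(Counter(subgroups), weights)
```

In a multitask model, the inferred subgroup index would select which task head scores a patient, and the per-patient weights would scale each patient's contribution to the training loss so that the smaller subgroup is not dominated by the larger one.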
