The Load Management Paradox: Correcting the Healthy-Worker Survivor Effect in NBA Injury Modeling

Yue Yu, Guanyu Hu

Abstract

In professional sports analytics, evaluating the relationship between accumulated workload and injury risk is a central objective. However, naive survival models applied to NBA game-log data consistently yield a paradox: players who recently logged heavy minutes appear less likely to sustain an injury. We demonstrate that this counterintuitive result is an artifact of the healthy-worker survivor effect, wherein conditioning on game participation induces severe collider bias driven by unobserved latent fitness. To address this structural confounding, we develop a Marginal Structural Piecewise Exponential Model (MS-PEM) that unifies inverse probability of treatment weighting (IPTW) with flexible piecewise-exponential additive models and weighted cumulative exposure (WCE). A simulation study confirms that this selection mechanism is mathematically sufficient to entirely reverse the sign of the true association between workload and injury. Applying the MS-PEM to 78,594 player-game observations across three NBA seasons (encompassing 771 players and 2,439 injury events), we find that adjusting for observed selection reliably shifts the hazard back toward the underlying physiological relationship. While the exact magnitude of the correction is sensitive to outcome-model regularization (attenuating the paradoxical weight function by 1% to 2% under conservative cross-validation and up to 63% to 78% under lighter penalization), the positive direction of the causal correction is highly robust across multiple propensity specifications and doubly robust checks. Ultimately, these results provide a methodological template for bias-aware sports injury modeling, while cautioning that models relying strictly on observational game logs will systematically underestimate the true risk of heavy workloads without richer physiological data for full causal identification.
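
The sign reversal described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation study: the coefficients, the selection rule, and the function name `simulate` are all assumptions chosen for illustration. Workload truly raises injury risk, latent fitness lowers it, and a player takes the court only when fitness outweighs recent workload, so restricting to games actually played couples heavy workload with high fitness.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n=100_000, seed=0):
    """Toy healthy-worker selection demo (all parameters are assumptions).

    Returns (full_diff, played_diff): injury risk of heavy-workload minus
    light-workload observations, in the full population and among those
    who played.
    """
    rng = random.Random(seed)
    # heavy flag -> [injury count, observation count]
    full = {True: [0, 0], False: [0, 0]}
    played = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        fitness = rng.gauss(0, 1)    # latent fitness U (unobserved in real data)
        workload = rng.gauss(0, 1)   # standardized recent workload L
        # Selection: participation A depends on both fitness and workload.
        plays = fitness - workload + rng.gauss(0, 0.5) > 0
        # Potential injury outcome: workload raises risk, fitness lowers it.
        p_injury = sigmoid(-2.0 + 0.3 * workload - 2.0 * fitness)
        injured = rng.random() < p_injury
        heavy = workload > 0
        full[heavy][0] += injured
        full[heavy][1] += 1
        if plays:
            played[heavy][0] += injured
            played[heavy][1] += 1

    def risk_diff(table):
        return table[True][0] / table[True][1] - table[False][0] / table[False][1]

    return risk_diff(full), risk_diff(played)


if __name__ == "__main__":
    full_diff, played_diff = simulate()
    print(f"risk difference, everyone:     {full_diff:+.3f}")
    print(f"risk difference, players only: {played_diff:+.3f}")
```

Under these assumed parameters the heavy-minus-light risk difference is positive in the full population and negative once the data are restricted to players who took the court, which is the pattern the abstract calls the load management paradox. Because the selection here depends on an unobserved variable, weighting on observed covariates alone would not be expected to remove the bias completely.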

Paper Structure

This paper contains 55 sections, 11 equations, 23 figures, 7 tables.

Figures (23)

  • Figure 1: Kaplan–Meier survival curves by game-gap category. Back-to-back games show paradoxically higher survival, consistent with healthy-worker selection rather than a protective effect of compressed schedules.
  • Figure 2: Single-timepoint causal DAG for the workload–injury relationship, showing the full set of observed and unobserved variables. The "Playing" node is a collider: conditioning on it opens a non-causal path between workload and injury through latent fitness.
  • Figure 3: Longitudinal causal DAG for the NBA workload–injury relationship across two game periods. $L_t$: covariates/workload; $U_t$: latent fitness (unobserved); $A_t$: game participation (collider); $Y_t$: injury. Conditioning on $A_t = 1$ opens the non-causal path $L_t \to A_t \leftarrow U_t \to Y_t$, creating the load management paradox.
  • Figure 4: Mathematical architecture of the MS-PEM framework. Counting-process game-log data feed three interacting components: (1) a logistic selection model that estimates game-participation probabilities $\widehat{\pi}_{it}$ and produces stabilized inverse-probability weights $\widehat{SW}_{it}=\bar{\pi}/\widehat{\pi}_{it}$, creating a pseudo-population free of observed selection bias; (2) a piecewise-exponential additive model (PAMM) with B-spline smooth terms $f_0, f_1, f_2$ for the baseline hazard, rest-days effect, and their interaction on interval-censored Poisson data; and (3) a weighted cumulative exposure (WCE) module that summarizes the lagged effect of past minutes via a smooth weight function $w(\ell)=\sum_k \gamma_k B_k(\ell)$ over $L=10$ game lags. The three components are unified in the MS-PEM objective: the IPW-reweighted penalized Poisson log-likelihood with ridge penalty $\alpha$, yielding partially bias-corrected estimates of the workload–injury relationship.
  • Figure 5: The MS-PEM data-processing pipeline. Game-log data are converted to counting-process format, then processed along two paths: a causal correction path (selection model $\to$ stabilized IPW weights) and a survival modeling path (PED transformation $\to$ PAMM + WCE). Both paths combine in the MS-PEM, which fits the IPW-weighted PAMM+WCE to produce partially corrected estimates.
  • ...and 18 more figures
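
Two of the quantities named in the Figure 4 caption can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. It assumes that $\bar{\pi}$ is the marginal participation rate, and that the weighted cumulative exposure is the sum $\sum_{\ell=1}^{L} w(\ell)\,m_{t-\ell}$ over past minutes $m$. The function names and the weight values in the usage note are hypothetical. In the MS-PEM the weights $w(\ell)$ come from a fitted B-spline expansion; here they are simply passed in.

```python
def stabilized_weights(pi_hat, played):
    """Stabilized IPW weights SW = pi_bar / pi_hat for observations that played.

    pi_hat : estimated participation probabilities, one per observation
    played : 0/1 participation indicators, same length
    pi_bar is taken to be the observed marginal participation rate
    (an assumption; this section does not define it).
    """
    pi_bar = sum(played) / len(played)
    return [pi_bar / p for p, a in zip(pi_hat, played) if a == 1]


def weighted_cumulative_exposure(minutes_history, lag_weights):
    """Sum of w(lag) * minutes over past games.

    minutes_history : minutes in past games, oldest first, most recent last
    lag_weights     : lag_weights[0] is w(1), applied to the most recent game
    Lags beyond the available history or beyond len(lag_weights) are ignored.
    """
    recent_first = list(reversed(minutes_history))
    return sum(w * m for w, m in zip(lag_weights, recent_first))


if __name__ == "__main__":
    sw = stabilized_weights([0.5, 0.8, 0.4, 0.8], [1, 1, 0, 1])
    print(sw)
    print(weighted_cumulative_exposure([30, 20, 10], [0.5, 0.3, 0.2]))
```

In a weighted fit, each played observation's contribution to the penalized Poisson log-likelihood would be multiplied by its stabilized weight, so observations with a low estimated probability of playing count for more.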