Adaptive Test-Time Training for Predicting Need for Invasive Mechanical Ventilation in Multi-Center Cohorts

Xiaolei Lu, Shamim Nemati

TL;DR

Cross-site distribution shifts in ICU EHR data undermine IMV-need prediction. AdaTTT addresses this with adaptive test-time training that combines dynamic self-supervised learning and prototype-guided Partial Optimal Transport to align test-time representations with source structure. Information-theoretic analysis links auxiliary-task alignment to main-task performance, and experiments across multi-center ICU cohorts show AdaTTT achieving top discrimination and improved calibration versus strong baselines, supporting its potential for real-time clinical risk monitoring. Limitations include potential sensitivity to extreme shifts and added test-time computation, motivating future work on efficiency and broader clinical validation.

Abstract

Accurate prediction of the need for invasive mechanical ventilation (IMV) in intensive care unit (ICU) patients is crucial for timely interventions and resource allocation. However, variability in patient populations, clinical practices, and electronic health record (EHR) systems across institutions introduces domain shifts that degrade the generalization performance of predictive models during deployment. Test-Time Training (TTT) has emerged as a promising approach to mitigate such shifts by adapting models dynamically during inference without requiring labeled target-domain data. In this work, we introduce Adaptive Test-Time Training (AdaTTT), an enhanced TTT framework tailored for EHR-based IMV prediction in ICU settings. We begin by deriving information-theoretic bounds on the test-time prediction error and demonstrate that it is constrained by the uncertainty between the main and auxiliary tasks. To enhance their alignment, we introduce a self-supervised learning framework with pretext tasks (reconstruction and masked feature modeling) optimized through a dynamic masking strategy that emphasizes features critical to the main task. Additionally, to improve robustness against domain shifts, we incorporate prototype learning and employ Partial Optimal Transport (POT) for flexible, partial feature alignment while maintaining clinically meaningful patient representations. Experiments across multi-center ICU cohorts demonstrate competitive classification performance on different test-time adaptation benchmarks.
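
The dynamic masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `importance_weighted_mask` and `masked_reconstruction_loss`, the `mask_ratio` parameter, and the rule of sampling masked features in proportion to an importance score are all assumptions made for illustration. The sketch only shows how a masked feature modeling loss might be steered toward features that matter for the main task.

```python
import numpy as np


def importance_weighted_mask(importance, mask_ratio=0.3, rng=None):
    """Return a boolean mask over features (True = feature is hidden).

    Features with larger importance scores are more likely to be masked,
    so the reconstruction task concentrates on them. This sampling rule
    is an illustrative assumption, not the rule stated in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    importance = np.asarray(importance, dtype=float)
    n_features = importance.shape[0]
    n_masked = max(1, int(round(mask_ratio * n_features)))
    # Shift to strictly positive weights, then normalize to probabilities.
    weights = importance - importance.min() + 1e-8
    probs = weights / weights.sum()
    masked_idx = rng.choice(n_features, size=n_masked, replace=False, p=probs)
    mask = np.zeros(n_features, dtype=bool)
    mask[masked_idx] = True
    return mask


def masked_reconstruction_loss(x, x_hat, mask):
    """Mean squared error computed only on the masked features."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.mean((x[..., mask] - x_hat[..., mask]) ** 2))
```

In a test-time training loop, a mask like this would be resampled as the importance scores change, and only the shared representation layers would be updated from the resulting self-supervised loss. How AdaTTT actually computes importance is not specified in this summary.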

Paper Structure

This paper contains 34 sections, 2 theorems, 41 equations, 8 figures, 7 tables.

Key Result

Lemma 1

In test-time training, where only the shared representation layers are updated using $Y_s'$, the following inequality holds:

Figures (8)

  • Figure 1: Risk score evolution during test-time training for a patient from Site A. Risk increases as intubation nears, which reflects model adaptation.
  • Figure 2: Feature importance evolution during training. The heatmap compares feature importance in the initial epochs with feature importance in the final epochs.
  • Figure 3: An example of feature importance evolution during test-time training. The heatmap shows the changes in feature importance across different iterations.
  • Figure 4: Evaluation of the number of gradient updates for test-time training on different test cohorts.
  • Figure 5: Cumulative AUC trend over an increasing number of patients.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Lemma 1
  • Theorem 2