Dynamic feature selection in medical predictive monitoring by reinforcement learning
Yutong Chen, Jiandong Gao, Ji Wu
TL;DR
This work tackles dynamic, cost-aware feature selection for multivariate time series in clinical monitoring by formulating it as an offline reinforcement learning problem with a POMDP-like structure. It introduces a predictor P_φ and an actor π_θ that chooses which features to update at each time step under a budget, balancing prediction accuracy (via a normalized prediction reward) against acquisition cost (via a dynamic cost reward). The policy is trained with PPO, and the predictor, which may be non-differentiable, is updated iteratively using synthesized states. Evaluations on the MIMIC-IV dataset across P/F ratio prediction (regression) and ventilation termination (classification) tasks show that the method matches baseline performance without cost constraints and outperforms strong baselines under strict cost limits, with interpretable, time-varying feature importance. The approach supports non-differentiable predictors, reveals actionable feature-importance dynamics over time, and shows potential for reducing unnecessary testing while maintaining predictive performance in real-world ICU monitoring; lower sample efficiency and the limitations of offline training remain to be addressed in future work.
Abstract
In this paper, we investigate dynamic feature selection in the multivariate time-series setting, which is common in clinical predictive monitoring, where each feature corresponds to a bio-test result. Many existing feature selection methods are designed for static data and therefore fail to exploit time-series information. Our approach addresses this limitation by selecting time-varying feature subsets for each patient. Specifically, we employ reinforcement learning to optimize a policy under maximum cost restrictions. The prediction model is subsequently updated using synthetic data generated by the trained policy. Our method integrates with non-differentiable prediction models. We conducted experiments on a large clinical dataset covering regression and classification tasks. The results demonstrate that our approach outperforms strong feature selection baselines, particularly under stringent cost limitations. Code will be released once the paper is accepted.
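To make the accuracy-versus-cost trade-off concrete, the following is a minimal sketch of how a per-step reward could combine a normalized prediction term with a budget-aware cost term. It is an illustration under stated assumptions, not the reward used in the paper: the exact reward definitions are not given in this section, and the function name `step_reward`, the `cost_weight` parameter, the overshoot penalty, and the feature names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class StepOutcome:
    reward: float  # prediction reward minus cost penalty for this step
    spent: float   # cumulative acquisition cost after this step


def step_reward(
    baseline_error: float,
    error: float,
    selected: Iterable[str],
    costs: Dict[str, float],
    spent: float,
    budget: float,
    cost_weight: float = 1.0,
) -> StepOutcome:
    """Hypothetical per-step reward for cost-aware feature acquisition.

    baseline_error: predictor error without the newly acquired features.
    error:          predictor error after acquiring the selected features.
    selected:       names of the features the policy chose to update.
    costs:          acquisition cost of each feature (assumed known).
    spent:          cumulative cost before this step.
    budget:         maximum total cost allowed for the episode.
    """
    eps = 1e-8  # guards against division by zero

    # Cost of the tests ordered at this time step.
    step_cost = sum(costs[name] for name in selected)
    new_spent = spent + step_cost

    # Normalized prediction reward: relative error reduction (assumed form).
    prediction_reward = (baseline_error - error) / (baseline_error + eps)

    # Dynamic cost term: the step cost as a fraction of the budget, plus an
    # extra penalty for any amount by which the budget is exceeded.
    overshoot = max(0.0, new_spent - budget)
    cost_penalty = cost_weight * (step_cost + overshoot) / budget

    return StepOutcome(reward=prediction_reward - cost_penalty, spent=new_spent)


# Example with made-up feature names and costs.
costs = {"pao2": 2.0, "lactate": 1.0}
outcome = step_reward(
    baseline_error=4.0, error=2.0, selected=["pao2"],
    costs=costs, spent=0.0, budget=10.0,
)
```

In this sketch, a policy that orders a test is rewarded only when the resulting error reduction outweighs the test's share of the budget, which is the kind of trade-off a PPO-trained actor would learn to exploit.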
