Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer
Dipesh Tamboli, Jiayu Chen, Kiran Pranesh Jotheeswaran, Denny Yu, Vaneet Aggarwal
TL;DR
Sepsis treatment requires dynamic, patient-specific decisions that static guidelines struggle to provide. The authors introduce POSNEGDM, an offline reinforcement learning framework that combines a Transformer-based DualSight decision maker with a Mortality Classifier serving as a feedback reinforcer, trained on MIMIC-III sepsis data discretized into 4-hour windows. By explicitly using both positive and negative demonstrations through a three-term loss $L_{total}=\alpha L_{action}+\beta L_{state}+\gamma L_{survival}$, the system learns to predict actions and next states that maximize patient survival. POSNEGDM reduces mortality to 2.61% (a 97.39% survival rate) and reaches 94.6% action-prediction fidelity, significantly outperforming baselines such as Decision Transformer and Behavioral Cloning; ablations confirm that both the transformer and the mortality feedback are critical. These results point to a clinically actionable, mortality-aware decision-support approach that could reduce ICU mortality and healthcare costs in sepsis care.
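To make the weighted sum $L_{total}=\alpha L_{action}+\beta L_{state}+\gamma L_{survival}$ concrete, here is a minimal single-time-step sketch. The summary does not specify the form of each term, so everything below is an assumption for illustration: cross-entropy for the action term, mean squared error for the state term, and the negative log of a classifier's survival probability for the survival term. The function name `total_loss` and all argument names are hypothetical.

```python
import math

def total_loss(action_probs, expert_action, pred_state, next_state,
               survival_prob, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum alpha*L_action + beta*L_state + gamma*L_survival (one step)."""
    eps = 1e-12  # guards against log(0)
    # L_action (assumed form): cross-entropy against the clinician's discrete action.
    l_action = -math.log(action_probs[expert_action] + eps)
    # L_state (assumed form): mean squared error between predicted and observed next state.
    l_state = sum((p - t) ** 2 for p, t in zip(pred_state, next_state)) / len(next_state)
    # L_survival (assumed form): penalize low survival probability from the classifier.
    l_survival = -math.log(survival_prob + eps)
    return alpha * l_action + beta * l_state + gamma * l_survival

# Example call with made-up numbers: three candidate actions, two state features.
loss = total_loss(
    action_probs=[0.1, 0.7, 0.2],
    expert_action=1,
    pred_state=[0.5, 1.2],
    next_state=[0.4, 1.0],
    survival_prob=0.9,
    alpha=1.0, beta=0.5, gamma=2.0,
)
print(f"total loss: {loss:.4f}")
```

Under these assumptions, raising $\gamma$ makes the survival term dominate, so the model is pushed toward actions the classifier associates with survival even when they differ from the recorded action.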
Abstract
Sepsis, a life-threatening condition triggered by the body's exaggerated response to infection, demands urgent intervention to prevent severe complications. Existing machine learning methods for managing sepsis struggle in offline scenarios, exhibiting suboptimal performance with survival rates below 50%. This paper introduces POSNEGDM ("Reinforcement Learning with Positive and Negative Demonstrations for Sequential Decision-Making"), a framework that uses a transformer-based model and a feedback reinforcer to replicate expert actions while accounting for individual patient characteristics. A mortality classifier with 96.7% accuracy guides treatment decisions towards positive outcomes. The POSNEGDM framework significantly improves patient survival, achieving a 97.39% survival rate and outperforming established machine learning algorithms (Decision Transformer and Behavioral Cloning), which reach survival rates of 33.4% and 43.5%, respectively. Ablation studies further underscore the critical role of both the transformer-based decision maker and the mortality classifier in raising overall survival rates. In summary, the proposed approach offers a promising avenue for improving sepsis treatment outcomes, contributing to better patient care and reduced healthcare costs.
