Deep Reinforcement Learning for Sepsis Treatment
Aniruddh Raghu, Matthieu Komorowski, Imran Ahmed, Leo Celi, Peter Szolovits, Marzyeh Ghassemi
TL;DR
The paper learns sepsis treatment policies from ICU data using continuous-state modeling and deep reinforcement learning. It employs a Dueling Double-Deep Q Network with a discretized action space for fluids and vasopressors, trained on MIMIC-III Sepsis-3 data, and uses SOFA- and lactate-based rewards with a terminal survival signal. Qualitative analyses reveal policies that align with physician behavior, particularly in vasopressor use. Off-policy evaluation suggests potential improvements over clinician policies, though the authors acknowledge the limitations of this evaluation. This work demonstrates the feasibility of continuous-state RL for ICU decision support and outlines directions for per-patient evaluation and model-based extensions.
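To make the two ingredients named in the TL;DR concrete, the following is a minimal NumPy sketch of the standard dueling aggregation and the standard double-DQN target, together with one way to index a joint fluid/vasopressor action. It is an illustration under stated assumptions, not the authors' implementation: the bin counts (`N_FLUID_BINS`, `N_VASO_BINS`), the helper names, and the discount factor are all hypothetical.

```python
import numpy as np

# Assumed discretization: the number of bins per drug is illustrative only.
N_FLUID_BINS = 5
N_VASO_BINS = 5


def action_index(fluid_bin, vaso_bin):
    """Map a (fluid bin, vasopressor bin) pair to a single discrete action id."""
    return fluid_bin * N_VASO_BINS + vaso_bin


def dueling_q(value, advantage):
    """Combine state value V(s) and advantages A(s, a) into Q(s, a).

    value:     array of shape (batch, 1)
    advantage: array of shape (batch, n_actions)
    Subtracting the mean advantage keeps V and A identifiable.
    """
    return value + advantage - advantage.mean(axis=1, keepdims=True)


def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double-DQN target: the online net picks the action, the target net scores it.

    reward, done:                 arrays of shape (batch,)
    q_next_online, q_next_target: arrays of shape (batch, n_actions)
    gamma is an assumed discount factor.
    """
    best_action = q_next_online.argmax(axis=1)
    next_q = q_next_target[np.arange(len(best_action)), best_action]
    # Terminal transitions (done == 1) contribute only the immediate reward.
    return reward + gamma * (1.0 - done) * next_q
```

Decoupling action selection (online network) from action evaluation (target network) is what reduces the overestimation of Q-values that a plain max over a single network tends to produce.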
Abstract
Sepsis is a leading cause of mortality in intensive care units and costs hospitals billions annually. Treating a septic patient is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we propose an approach to deduce treatment policies for septic patients by using continuous state-space models and deep reinforcement learning. Our model learns clinically interpretable treatment policies, similar in important aspects to the treatment policies of physicians. The learned policies could be used to aid intensive care clinicians in medical decision making and improve the likelihood of patient survival.
