Integrating Neural Differential Forecasting with Safe Reinforcement Learning for Blood Glucose Regulation
Yushen Liu, Yanfu Zhang, Xugui Zhou
TL;DR
The paper tackles safe, personalized automated insulin dosing for Type 1 Diabetes under meal uncertainty. It introduces TSODE, a safety-aware controller that combines Thompson Sampling reinforcement learning with a NeuralODE forecaster and a conformal safety layer to predict short-term glucose trajectories and certify each action. In the UVa/Padova simulator, TSODE achieves 87.9% time-in-range with less than 10% time below 70 mg/dL and demonstrates strong generalization to unseen patients. Overall, the work shows that probabilistic safety filtering integrated with adaptive dosing can deliver robust, interpretable closed-loop glucose control.
Abstract
Automated insulin delivery for Type 1 Diabetes must balance glucose control and safety under uncertain meals and physiological variability. While reinforcement learning (RL) enables adaptive personalization, existing approaches struggle to simultaneously guarantee safety and can take risky actions such as overdosing before meals or stacking corrections, leaving a gap in achieving glucose control that is both personalized and risk-aware. To bridge this gap, we propose TSODE, a safety-aware controller that integrates Thompson Sampling RL with a Neural Ordinary Differential Equation (NeuralODE) forecaster. Specifically, the NeuralODE predicts short-term glucose trajectories conditioned on proposed insulin doses, while a conformal calibration layer quantifies predictive uncertainty to reject or scale risky actions. In the FDA-approved UVa/Padova simulator (adult cohort), TSODE achieved 87.9% time-in-range with less than 10% time below 70 mg/dL, outperforming relevant baselines. These results demonstrate that integrating adaptive RL with calibrated NeuralODE forecasting enables interpretable, safe, and robust glucose regulation.
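The "reject or scale" step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a split-conformal quantile over absolute forecast residuals, a hypothetical `toy_forecast` function standing in for the NeuralODE, and illustrative parameters (`shrink`, `max_steps`, a linear glucose drop per insulin unit) that do not come from the paper. Only the 70 mg/dL hypoglycemia threshold is taken from the text.

```python
import math


def conformal_quantile(residuals, alpha):
    """Split-conformal quantile of absolute residuals at miscoverage level alpha.

    Uses the finite-sample rank ceil((n + 1) * (1 - alpha)); returns infinity
    when the calibration set is too small to support the requested coverage.
    """
    n = len(residuals)
    scores = sorted(abs(r) for r in residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
    if k > n:
        return math.inf
    return scores[k - 1]


def toy_forecast(dose, g0=140.0, drop_per_unit=30.0, steps=6):
    """Hypothetical stand-in for the NeuralODE forecaster (assumption).

    Glucose falls linearly from g0 by drop_per_unit mg/dL per insulin unit
    over the forecast horizon.
    """
    return [g0 - drop_per_unit * dose * (t / steps) for t in range(1, steps + 1)]


def safe_dose(proposed_dose, forecast, q, hypo_threshold=70.0, shrink=0.5, max_steps=5):
    """Accept, scale down, or reject a proposed dose.

    A dose is accepted when the lower conformal bound of its forecast
    (minimum predicted glucose minus q) stays at or above hypo_threshold.
    Otherwise the dose is shrunk and re-checked; if no candidate passes
    within max_steps, the action is rejected and 0.0 is returned.
    """
    dose = proposed_dose
    for _ in range(max_steps):
        lower_bound = min(forecast(dose)) - q
        if lower_bound >= hypo_threshold:
            return dose
        dose *= shrink
    return 0.0


# Calibration residuals (mg/dL) are made-up illustration data.
calibration_residuals = [4, -6, 3, -8, 5, -2, 7, -1, 9, -10]
q = conformal_quantile(calibration_residuals, alpha=0.1)

print(safe_dose(1.0, toy_forecast, q))  # accepted unchanged
print(safe_dose(3.0, toy_forecast, q))  # scaled down
print(safe_dose(1.0, lambda d: toy_forecast(d, g0=75.0), q))  # rejected
```

In this sketch the filter acts only on the lower prediction bound, so it guards against hypoglycemia but not hyperglycemia. A fuller version would presumably also weigh the upper bound when choosing among candidate doses.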
