Towards Physiologically Sensible Predictions via the Rule-based Reinforcement Learning Layer
Lingwei Zhu, Zheng Chen, Yukie Nagai, Jimeng Sun
TL;DR
This work addresses physiologically invalid predictions produced by data-driven healthcare models by introducing the Rule-based Reinforcement Learning Layer (RRLL), a lightweight RL layer that learns to reassign labels using a compact set of impossibility rules. RRLL defines an MDP over $K$-class one-hot labels, where states fuse the base predictor's outputs with recent actions and features, and rewards penalize rule-violating transitions. Trained with REINFORCE and an action-change penalty, RRLL can be attached to any predictor to yield end-to-end predictions that are more physiologically plausible. Across sleep staging and seizure onset detection, RRLL improves accuracy and substantially reduces impossible transitions, demonstrating generality across domains and base models. The approach offers a practical path toward deployable, clinically sensible AI systems without requiring extensive domain-specific modeling of disease progression.
Abstract
This paper adds to the growing literature on reinforcement learning (RL) for healthcare by proposing a novel paradigm: augmenting any predictor with a Rule-based RL Layer (RRLL) that corrects the model's physiologically impossible predictions. Specifically, RRLL takes the predicted labels as input states and outputs corrected labels as actions. The reward of each state-action pair is evaluated by a set of general rules. RRLL is efficient, general and lightweight: unlike prior work, it does not require heavy expert knowledge, only a set of impossible transitions. This set is much smaller than the set of all possible transitions, yet it effectively reduces the physiologically impossible mistakes made by state-of-the-art predictor models. We verify the utility of RRLL on a variety of important healthcare classification problems and observe significant improvements using the same setup, with only the domain-specific set of impossible transitions changed. In-depth analysis shows that RRLL indeed improves accuracy by effectively reducing the presence of physiologically impossible predictions.
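To make the idea of a rule-evaluated reward concrete, here is a minimal Python sketch, assuming a simple scheme in which a transition between consecutive labels receives a fixed negative reward when it appears in the impossibility set and zero otherwise. The names `IMPOSSIBLE`, `rule_reward` and `count_impossible`, the class indices, and the penalty value are all hypothetical and chosen for illustration; they are not taken from the paper, whose exact reward may differ.

```python
from typing import Sequence, Set, Tuple

Transition = Tuple[int, int]

# Hypothetical impossibility set over class indices (illustration only,
# not the rules used in the paper): label 0 may not be followed by 3 or 4.
IMPOSSIBLE: Set[Transition] = {(0, 3), (0, 4)}


def rule_reward(prev_label: int, new_label: int,
                impossible: Set[Transition] = IMPOSSIBLE,
                penalty: float = 1.0) -> float:
    """Return -penalty if (prev_label -> new_label) is a forbidden transition, else 0."""
    return -penalty if (prev_label, new_label) in impossible else 0.0


def count_impossible(labels: Sequence[int],
                     impossible: Set[Transition] = IMPOSSIBLE) -> int:
    """Count forbidden transitions between consecutive labels in a sequence."""
    return sum((a, b) in impossible for a, b in zip(labels, labels[1:]))


if __name__ == "__main__":
    predicted = [0, 3, 2, 0, 4, 4]
    print(count_impossible(predicted))
    print([rule_reward(a, b) for a, b in zip(predicted, predicted[1:])])
```

Under this assumption, moving to a new domain would only require swapping the contents of the impossibility set, which mirrors the claim that the same setup is reused across tasks.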
