Deep Reinforcement Learning for Personalized Diagnostic Decision Pathways Using Electronic Health Records: A Comparative Study on Anemia and Systemic Lupus Erythematosus

Lillian Muyama, Antoine Neuraz, Adrien Coulet

TL;DR

This work treats clinical diagnosis as a sequential decision problem and trains DRL agents to generate patient-specific diagnostic pathways from synthetic EHR data for anemia and SLE. By encoding features as acquisition actions and diagnoses as terminal actions, the approach yields explainable step-by-step reasoning and remains robust to noise and missing values, outperforming traditional classifiers under imperfect data in some scenarios. The study shows that dueling DQN with prioritized experience replay achieves strong accuracy while producing concise pathways, and that pathway-based metrics like wPAHM can balance diagnosis quality with pathway efficiency. The results suggest DRL can augment guidelines by learning adaptable, explainable diagnostic processes that generalize to conditions with tree-like or weighted-criteria diagnosis schemas, with future work focusing on real-world EHR validation and multimodal data integration.
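The action encoding described above can be sketched as a small environment. This is a minimal illustration, not the authors' implementation: the feature names, diagnosis labels, placeholder value, step limit, and reward values are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical names for illustration only; the paper's actual feature and
# diagnosis sets are not listed in this section.
FEATURES = ["hemoglobin", "mcv", "ferritin"]
DIAGNOSES = ["no_anemia", "anemia"]
UNKNOWN = -1.0  # assumed placeholder for a feature that has not been acquired


class DiagnosisEnv:
    """One patient record as an episode: acquire features, then diagnose."""

    def __init__(self, record: Dict[str, float], label: str, max_steps: int = 5):
        self.record = record
        self.label = label
        self.max_steps = max_steps
        self.reset()

    def reset(self) -> List[float]:
        # Every feature starts hidden; the pathway records the actions taken.
        self.state = [UNKNOWN] * len(FEATURES)
        self.acquired = set()
        self.pathway = []
        return list(self.state)

    def step(self, action: int) -> Tuple[List[float], float, bool]:
        n = len(FEATURES)
        if action >= n:
            # Diagnosis actions are terminal. The +1/-1 reward is an assumption.
            diagnosis = DIAGNOSES[action - n]
            self.pathway.append(diagnosis)
            reward = 1.0 if diagnosis == self.label else -1.0
            return list(self.state), reward, True
        # Acquisition actions reveal one feature value from the record.
        name = FEATURES[action]
        repeated = action in self.acquired
        self.acquired.add(action)
        self.state[action] = self.record[name]
        self.pathway.append(name)
        out_of_steps = len(self.pathway) >= self.max_steps
        # Assumed penalty for re-querying a feature or running out of steps.
        reward = -1.0 if (repeated or out_of_steps) else 0.0
        return list(self.state), reward, out_of_steps


env = DiagnosisEnv({"hemoglobin": 10.5, "mcv": 72.0, "ferritin": 8.0}, label="anemia")
env.reset()
state, reward, done = env.step(0)  # acquire hemoglobin
final_state, final_reward, final_done = env.step(len(FEATURES) + 1)  # diagnose "anemia"
print(env.pathway)
```

The recorded `pathway` list is what makes the decision explainable step by step: it holds the ordered feature acquisitions followed by the final diagnosis.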

Abstract

Background: Clinical diagnosis is typically reached by following a series of steps recommended by guidelines authored by colleges of experts. Accordingly, guidelines play a crucial role in rationalizing clinical decisions but suffer from limitations as they are built to cover the majority of the population and fail at covering patients with uncommon conditions. Moreover, their updates are long and expensive, making them unsuitable for emerging diseases and practices.

Methods: Inspired by guidelines, we formulate the task of diagnosis as a sequential decision-making problem and study the use of Deep Reinforcement Learning (DRL) algorithms to learn the optimal sequence of actions to perform in order to obtain a correct diagnosis from Electronic Health Records (EHRs). We apply DRL on synthetic, but realistic EHRs and develop two clinical use cases: Anemia diagnosis, where the decision pathways follow the schema of a decision tree; and Systemic Lupus Erythematosus (SLE) diagnosis, which follows a weighted criteria score. We particularly evaluate the robustness of our approaches to noisy and missing data since these frequently occur in EHRs.

Results: In both use cases, and in the presence of imperfect data, our best DRL algorithms exhibit competitive performance when compared to the traditional classifiers, with the added advantage that they enable the progressive generation of a pathway to the suggested diagnosis which can both guide and explain the decision-making process.

Conclusion: DRL offers the opportunity to learn personalized decision pathways to diagnosis. We illustrate with our two use cases their advantages: they generate step-by-step pathways that are self-explanatory; and their correctness is competitive when compared to state-of-the-art approaches.

Paper Structure (45 sections, 4 equations, 14 figures, 14 tables)

Figures (14)

  • Figure 1: The decision tree used to label our anemia dataset, adapted from bmj_anemia and short2013iron.
  • Figure 2: Graphs showing the effect of the $\lambda$ value in the reward function on (a) the accuracy and (b) the wPAHM scores for the dueling DQN-PER model.
  • Figure 3: The accuracy-average pathway score tradeoff based on different values of $\lambda$ in the reward function. The results for the dueling DQN-PER model are used here.
  • Figure 4: Accuracy of approaches with varying levels of missingness, noisiness and train set size for the anemia dataset. (a) shows the mean accuracy of the models at different missingness levels; (b) shows the mean accuracy of the models at different noisiness levels; (c) shows the mean accuracy of the models at a constant noisiness level (0.2) and different missingness levels; (d) shows the mean accuracy and the 95% confidence interval of the models based on the size of the train set. The y-axis of all the plots starts at 60 to make the performance difference clearer.
  • Figure 5: Accuracy of approaches with varying levels of missingness, noisiness, and train set size, with the lupus dataset. The graphs show the mean accuracy of the models at different (a) missingness levels; (b) noisiness levels; (c) missingness levels at a constant noisiness level (0.2). (d) shows the mean accuracy and the 95% confidence interval of the models as a function of the size of the train set. The y-axis of all the plots starts at 60 to make the performance difference clearer.
  • ...and 9 more figures