Reproducible workflow for online AI in digital health
Susobhan Ghosh, Bhanu T. Gullapalli, Daiqi Gao, Asim Gazi, Anna Trella, Ziping Xu, Kelly Zhang, Susan A. Murphy
TL;DR
This paper tackles the challenge of achieving reproducibility for online AI decisions in digital health interventions. It proposes a three-phase workflow (design, integration/deployment, post-deployment inference) anchored by digital twins and rigorous logging to enable replay, auditing, and cross-deployment comparison. Drawing on the real-world deployments HeartSteps, Oralytics, and MiWaves, it provides concrete practices for environment isolation, autonomous algorithm design with stability guarantees, comprehensive logging of data and algorithm state, version control, and monitoring with safe fallbacks. The goal is to support trustworthy, iterative improvement of adaptive digital health interventions while acknowledging practical constraints from commercial devices and platforms.
Abstract
Online artificial intelligence (AI) algorithms are an important component of digital health interventions. These online algorithms are designed to continually learn and improve their performance as streaming data is collected on individuals. Deploying online AI presents a key challenge: balancing the adaptability of online AI with reproducibility. Online AI in digital interventions is a rapidly evolving area, driven by advances in algorithms, sensors, software, and devices. Digital health intervention development and deployment is a continuous process in which implementation, including the AI decision-making algorithm, is interspersed with cycles of re-development and optimization. Each deployment informs the next, making iterative deployment a defining characteristic of this field. This iterative nature underscores the importance of reproducibility: data collected across deployments must be accurately stored to have scientific utility, algorithm behavior must be auditable, and results must be comparable over time to facilitate scientific discovery and trustworthy refinement. This paper proposes a reproducible scientific workflow for developing, deploying, and analyzing online AI decision-making algorithms in digital health interventions. Grounded in practical experience from multiple real-world deployments, this workflow addresses key challenges to reproducibility across all phases of the online AI algorithm development life cycle.
