Practical considerations when designing an online learning algorithm for an app-based mHealth intervention
Rachel T Gonzalez, Madeline R Abbott, Brahmajee Nallamothu, Scott Hummel, Michael Dorsch, Walter Dempsey
TL;DR
The paper tackles the practical design challenges of deploying online reinforcement learning in app-based mHealth trials by detailing LS4L2, a contextual bandit system that optimizes notifications through a probabilistic mapping learned via hierarchical Bayesian logistic regression. It presents a concrete template addressing reward definition, optimization timescale, automated learning robustness, computational trade-offs, and missing data handling, supported by simulation comparing LS4L2 with simpler and more complex baselines. Key contributions include a principled approach to partial pooling with weak priors to prevent model breakdowns, a monitoring framework for both algorithm performance and calibration, and actionable guidelines for model specification under resource constraints. The findings highlight the necessity of balancing personalization with computation and emphasize alignment between reward design and behavioral targets, offering practical pathways for scalable, stable, and interpretable RL-enabled digital interventions in real-world clinical trials.
Abstract
The ubiquitous nature of mobile health (mHealth) technology has expanded opportunities for the integration of reinforcement learning into traditional clinical trial designs, allowing researchers to learn individualized treatment policies during the study. LowSalt4Life 2 (LS4L2) is a recent trial aimed at reducing sodium intake among hypertensive individuals through an app-based intervention. A reinforcement learning algorithm, which was deployed in one of the trial arms, was designed to send reminder notifications to promote app engagement in contexts where the notification would be effective, i.e., when a participant is likely to open the app in the next 30 minutes, and to withhold notifications when prior data suggested reduced effectiveness. Such an algorithm can improve app-based mHealth interventions by reducing participant burden and more effectively promoting behavior change. We encountered various challenges during the implementation of the learning algorithm, which we present as a template for solving challenges in future trials that deploy reinforcement learning algorithms. We provide template solutions based on LS4L2 for solving the key challenges of (i) defining a relevant reward, (ii) determining a meaningful timescale for optimization, (iii) specifying a robust statistical model that allows for automation, (iv) balancing model flexibility with computational cost, and (v) addressing missing values in gradually collected data.
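To make challenge (i) concrete, the following is a minimal sketch of a binary engagement reward consistent with the description above: the reward is 1 if the participant opens the app within 30 minutes of a decision time, and 0 otherwise. The function name, the argument names, and the treatment of the window boundaries (open at the start, closed at the end) are illustrative assumptions and are not taken from the LS4L2 implementation.

```python
from datetime import datetime, timedelta
from typing import Iterable

# The 30-minute window comes from the abstract; everything else is hypothetical.
REWARD_WINDOW = timedelta(minutes=30)


def binary_reward(
    decision_time: datetime,
    app_opens: Iterable[datetime],
    window: timedelta = REWARD_WINDOW,
) -> int:
    """Return 1 if any app open falls in (decision_time, decision_time + window], else 0.

    Assumption: an open exactly at the decision time does not count,
    while an open exactly at the end of the window does.
    """
    window_end = decision_time + window
    return int(any(decision_time < t <= window_end for t in app_opens))


# Usage: one decision time at noon and a few hypothetical app-open timestamps.
decision = datetime(2024, 1, 1, 12, 0)
opened_in_window = binary_reward(decision, [datetime(2024, 1, 1, 12, 20)])
opened_too_late = binary_reward(decision, [datetime(2024, 1, 1, 12, 45)])
never_opened = binary_reward(decision, [])
```

Under these assumptions, `opened_in_window` evaluates to 1 and the other two to 0. A shorter or longer window changes which opens are credited to a given decision, which is one way the choice of reward interacts with the choice of optimization timescale in challenge (ii).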
