Reinforcement Learning on Dyads to Enhance Medication Adherence

Ziping Xu, Hinal Jajal, Sung Won Choi, Inbal Nahum-Shani, Guy Shani, Alexandra M. Psihogios, Pei-Yao Hung, Susan Murphy

TL;DR

The paper addresses medication adherence in adolescents and young adults (AYAs) after allogeneic hematopoietic cell transplantation by personalizing the delivery of a three-component digital intervention with a novel multi-agent reinforcement learning (MARL) framework. The framework encodes domain knowledge in a causal DAG and uses surrogate rewards to handle delayed mediator effects. Three specialized agents, one per intervention component, act at twice-daily, daily, and weekly time scales to maximize the cumulative adherence $\sum_{w=1}^{14}\sum_{d=1}^{7}\sum_{t=1}^{2} R_{w,d,t}^{AYA}$. In Roadmap 2.0–based simulations with 25 dyads over 14 weeks, the MARL approaches outperform single-agent and random baselines, and surrogate rewards provide additional gains at standardized treatment effects (STEs) of 0.15, 0.3, and 0.5. These results support advancing to the ADAPTS-HCT trial, although the environment is synthetic and the approach still needs validation under real-world recruitment dynamics.
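
As a minimal sketch of the objective above, the triple sum can be evaluated over a nested list indexed by week, day, and decision time. The function name `cumulative_adherence` and the binary encoding of each reward (1 for a dose taken, 0 otherwise) are assumptions made for illustration and are not taken from the paper.

```python
# Horizon from the summary: 14 weeks, 7 days per week, 2 decision times per day.
WEEKS, DAYS, TIMES = 14, 7, 2


def cumulative_adherence(rewards):
    """Return the sum of rewards[w][d][t] over all weeks, days, and decision times.

    `rewards` is a nested list of shape WEEKS x DAYS x TIMES. The binary
    encoding used in the example below is an illustrative assumption.
    """
    if len(rewards) != WEEKS:
        raise ValueError("expected one entry per week")
    total = 0
    for week in rewards:
        if len(week) != DAYS:
            raise ValueError("expected one entry per day")
        for day in week:
            if len(day) != TIMES:
                raise ValueError("expected one entry per decision time")
            total += sum(day)
    return total


# Hypothetical AYA who adheres at every decision time.
perfect = [[[1] * TIMES for _ in range(DAYS)] for _ in range(WEEKS)]
best_possible = cumulative_adherence(perfect)  # 14 * 7 * 2 decision times
```

Under the binary assumption, the largest attainable value equals the number of decision times in the study, 14 × 7 × 2 = 196.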

Abstract

Medication adherence is critical for the recovery of adolescents and young adults (AYAs) who have undergone hematopoietic cell transplantation (HCT). However, maintaining adherence is challenging for AYAs after hospital discharge, who experience both individual barriers (e.g., physical and emotional symptoms) and interpersonal barriers (e.g., relational difficulties with their care partner, who is often involved in medication management). To optimize the effectiveness of a three-component digital intervention targeting both members of the dyad as well as their relationship, we propose a novel Multi-Agent Reinforcement Learning (MARL) approach to personalize the delivery of interventions. By incorporating domain knowledge, the MARL framework, in which each agent is responsible for delivering one intervention component, learns faster than a flattened agent. Evaluation in a dyadic simulator environment, based on real clinical data, shows a significant improvement in medication adherence (approximately 3%) compared to purely random intervention delivery. The effectiveness of this approach will be further evaluated in an upcoming trial.

Paper Structure

This paper contains 27 sections, 11 equations, 4 figures, 8 tables, and 4 algorithms.

Figures (4)

  • Figure 1: Causal diagram for the ADAPTS-HCT intervention. We categorize the variables into three components: AYA component (marked in black), care partner component (marked in red), and relationship component (marked in green). Each component operates at a different time scale. Variables in the AYA component evolve on a twice-daily basis, while the care partner component operates on a daily basis. The relationship component operates on a weekly basis. The arrows indicate the direct causal effects.
  • Figure 2: Cumulative adherence improvement over the uniform random policy for all three components under dyadic environments with different STEs. The confidence interval is the standard deviation based on 1000 independent runs.
  • Figure 3: Relationship between the hyperparameters and the STE, categorized by the mediator effect value.
  • Figure 4: Cumulative reward improvement over the uniform random policy for all three components under the testbed without the effect of care-partner distress on relationship quality or the effect of relationship quality on the AYA's adherence.
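
The three time scales described in the Figure 1 caption (twice daily for the AYA component, daily for the care partner component, weekly for the relationship component) can be illustrated with a small scheduling sketch. The agent labels, the function names, and the choice to place the daily decision at the first decision time of each day and the weekly decision on the first day of each week are assumptions for illustration; the figure list does not specify them.

```python
from collections import Counter

# Horizon from the summary: 14 weeks, 7 days per week, 2 decision times per day.
WEEKS, DAYS, TIMES = 14, 7, 2


def acting_agents(day, time):
    """Return the (hypothetical) agent labels that decide at this day and time.

    Assumed placement: the care partner agent acts at the first decision time
    of each day, and the relationship agent acts at the first decision time of
    the first day of each week. Days and times are 1-indexed.
    """
    agents = ["aya"]  # the AYA agent acts at every decision time
    if time == 1:
        agents.append("care_partner")
        if day == 1:
            agents.append("relationship")
    return agents


def decision_counts():
    """Count how many decisions each agent makes over the full horizon."""
    counts = Counter()
    for _week in range(1, WEEKS + 1):
        for day in range(1, DAYS + 1):
            for time in range(1, TIMES + 1):
                counts.update(acting_agents(day, time))
    return counts


counts = decision_counts()
```

Under these assumptions the AYA agent makes 196 decisions per dyad, the care partner agent 98, and the relationship agent 14, which shows how much less feedback the slower components receive within a single 14-week horizon.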