
Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks

Eura Nofshin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

TL;DR

This work introduces Behavior Model RL (BMRL), a framework for interventions that edit maladapted human MDP parameters to guide frictionful tasks toward long-term goals. By modeling humans as boundedly rational planners and allowing the AI to transiently modify parameters like the discount factor $\gamma$ or rewards $R$, BMRL achieves rapid, interpretable personalization. The chainworld model provides analytical solutions and a three-window AI policy class, and the authors prove AI-equivalence results that extend applicability to more realistic human models (monotonic, progress, multi-chain, and negative-effect worlds). Empirical results show robust, online personalization under misspecification and demonstrate the framework's potential for domains such as physical therapy, medication adherence, and digital learning, while highlighting ethical considerations for real-world deployment.

Abstract

Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans.
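To make the role of a maladapted discount factor concrete, the following is a minimal sketch of a toy chain task, not the paper's chainworld definition. It assumes a deterministic chain in which "work" costs a fixed effort and moves one step toward an absorbing goal that pays a one-time reward, while "rest" stays in place at zero reward. The function name `human_policy` and all parameter values are hypothetical and chosen only for illustration.

```python
from typing import List


def human_policy(n_states: int, effort: float, goal_reward: float,
                 gamma: float) -> List[str]:
    """Plan for a toy human on a chain of `n_states` non-goal states.

    Assumptions (illustrative, not taken from the paper):
      * "work" moves from state s to s + 1 and costs `effort`;
        leaving the last state reaches the goal and pays `goal_reward`.
      * "rest" keeps the human in place with reward 0, so resting
        forever is worth 0.
      * The goal is absorbing with value 0.
    """
    policy = [""] * n_states
    v_next = 0.0  # value of the absorbing goal state
    # Backward induction from the state adjacent to the goal.
    for s in range(n_states - 1, -1, -1):
        bonus = goal_reward if s == n_states - 1 else 0.0
        q_work = -effort + bonus + gamma * v_next
        # Resting forever is worth 0, so the human works only if q_work > 0.
        policy[s] = "work" if q_work > 0.0 else "rest"
        v_next = max(0.0, q_work)
    return policy


# A myopic human (low gamma) disengages at the start of the chain.
myopic = human_policy(n_states=10, effort=1.0, goal_reward=20.0, gamma=0.6)
# A hypothetical intervention that raises gamma restores goal-directed behavior.
helped = human_policy(n_states=10, effort=1.0, goal_reward=20.0, gamma=0.95)
print(myopic[0], helped[0])
```

In this toy setting the myopic human only works once it is already close to the goal, which is one way to read the abstract's point that an undesirable policy can be attributed to the planner's parameters rather than to irrational planning.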

Paper Structure (39 sections, 8 theorems, 22 equations, 14 figures, 2 tables)

Key Result

theorem 1

Let $t_h^\text{min} = \min\left\{t_h^0, t_h^\gamma, t_h^b\right\}$ denote the earliest human threshold resulting from any AI action, and let $t_{AI}$ denote the AI intervention threshold, as in Definition 3 (AI threshold). Then the optimal AI policy $\pi_{AI}^*$ belongs to the three-window policy class $\bar{\Pi}$.

Figures (14)

  • Figure 1: Overview of BMRL. The human agent interacts with the environment as in standard RL. The AI agent's actions affect the human agent. The human agent + environment form the AI environment.
  • Figure 2: Graphical representation of the chainworld.
  • Figure 3: Example of different optimal AI policies for two humans with different chainworld parameters. Each square is a chainworld state. An $a_b$ means the AI should select the action that reduces $r_b$, while an $a_\gamma$ means the AI should select the action that increases $\gamma$. Red solid and blue dotted lines show the start and end of the intervention window.
  • Figure 4: When the true human model is a chainworld, our method rapidly personalizes. The plot shows AI rewards (y-axis) over multiple episodes (x-axis). Lines closer to the upper left personalize more quickly.
  • Figure 5: Chainworld scales to large gridworlds. An example gridworld is shown on the left. Moving right, the grid's width (X) and height (Y) increase.
  • ...and 9 more figures

Theorems & Definitions (17)

  • definition 1: AI equivalence of human MDPs
  • definition 2: Human threshold
  • definition 3: AI threshold
  • theorem 1: Chainworld AI policies
  • theorem 2: Chainworld equivalence class
  • definition 4: Monotonic chainworlds
  • definition 5: Progress worlds
  • definition 6: Multi-chain worlds
  • definition 7: Negative effect worlds
  • theorem 3: Chainworld and monotonic chainworld equivalence
  • ...and 7 more