Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study

Xueqing Liu; Nina Deliu; Tanujit Chakraborty; Lauren Bell; Bibhas Chakraborty

TL;DR

This work addresses online decision-making for just-in-time adaptive interventions (JITAIs) in mobile health (mHealth) when the proximal outcome is a count that exhibits overdispersion and zero-inflation. The authors develop TS-Count, a family of Thompson sampling algorithms that couple Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) models with online learning, using Laplace approximations for tractable posterior sampling. They provide Bayesian and frequentist regret bounds and show, through simulations and a case study based on the Drink Less micro-randomized trial (MRT), that TS-Count generally improves cumulative proximal outcomes and user engagement, with Poisson-based variants often performing best in practice. The approach is practical and scalable for count-based online personalization in real-world MRTs, and it can be extended, for example by pooling information across users or by using alternative posterior approximation methods.
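
As a reminder of what zero-inflation means for a count outcome, the standard zero-inflated Poisson distribution mixes a point mass at zero with a Poisson distribution. The symbols $\pi$ (probability of a structural zero) and $\lambda$ (Poisson rate) are generic notation assumed here and may differ from the paper's own parameterization:

$$
P(Y = 0) = \pi + (1-\pi)\,e^{-\lambda}, \qquad
P(Y = y) = (1-\pi)\,\frac{e^{-\lambda}\lambda^{y}}{y!}, \quad y = 1, 2, \dots
$$

Under this form, $\mathbb{E}[Y] = (1-\pi)\lambda$ and $\mathrm{Var}(Y) = (1-\pi)\lambda(1+\pi\lambda)$, so the variance exceeds the mean whenever $\pi > 0$ and $\lambda > 0$. Zero-inflation alone therefore already induces overdispersion relative to a plain Poisson model.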

Abstract

Mobile health (mHealth) interventions often aim to improve distal outcomes, such as clinical conditions, by optimizing proximal outcomes through just-in-time adaptive interventions. Contextual bandits provide a suitable framework for customizing such interventions according to individual time-varying contexts. However, unique challenges, such as modeling count outcomes within bandit frameworks, have hindered the widespread application of contextual bandits to mHealth studies. The current work addresses this challenge by integrating count data models into online decision-making approaches. Specifically, we combine four common offline count data models (Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regressions) with Thompson sampling, a popular contextual bandit algorithm. The proposed algorithms are motivated by and evaluated on a real dataset from the Drink Less trial, where they are shown to improve user engagement with the mHealth platform. The proposed methods are further evaluated on simulated data, achieving improvement in maximizing cumulative proximal outcomes over existing algorithms. Theoretical results on regret bounds are also derived. The countts R package provides an implementation of our approach.
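
To make the combination of a count model with Thompson sampling concrete, the following is a minimal Python sketch of the Poisson case with a Laplace (Gaussian) posterior approximation. It is not the countts package and not necessarily the authors' exact algorithm: the Gaussian prior, the Newton solver, the two-action feature map `feature_map`, the simulated environment, and the parameter vector `theta_true` are all assumptions made for illustration.

```python
import numpy as np


def laplace_poisson_posterior(Phi, y, prior_var=1.0, n_iter=50):
    """Laplace approximation for a Poisson log-linear model.

    Assumes a N(0, prior_var * I) prior (an illustrative choice).
    Returns the posterior mode and the inverse negative Hessian at the mode.
    """
    d = Phi.shape[1]
    prior_prec = np.eye(d) / prior_var
    theta = np.zeros(d)
    for _ in range(n_iter):
        mu = np.exp(np.clip(Phi @ theta, -20.0, 20.0))
        grad = Phi.T @ (y - mu) - prior_prec @ theta
        neg_hess = Phi.T @ (Phi * mu[:, None]) + prior_prec
        step = np.linalg.solve(neg_hess, grad)  # Newton step
        theta = theta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    mu = np.exp(np.clip(Phi @ theta, -20.0, 20.0))
    cov = np.linalg.inv(Phi.T @ (Phi * mu[:, None]) + prior_prec)
    return theta, (cov + cov.T) / 2.0  # symmetrize for numerical safety


def feature_map(action, context):
    """Hypothetical feature map: baseline terms plus action-by-context terms."""
    return np.concatenate([context, action * context])


def run_poisson_ts(n_steps=200, seed=0):
    """Simulate Thompson sampling with two actions and Poisson rewards."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([0.2, 0.3, 0.5, -0.4])  # hypothetical ground truth
    d = theta_true.size
    Phi, y = np.empty((0, d)), np.empty(0)
    cumulative_regret, total = [], 0.0
    for _ in range(n_steps):
        context = np.array([1.0, rng.uniform(-1.0, 1.0)])  # intercept + covariate
        theta_hat, cov = laplace_poisson_posterior(Phi, y)
        theta_tilde = rng.multivariate_normal(theta_hat, cov)  # posterior draw
        feats = np.stack([feature_map(a, context) for a in (0, 1)])
        action = int(np.argmax(feats @ theta_tilde))  # act greedily on the draw
        mean_rewards = np.exp(feats @ theta_true)
        reward = rng.poisson(mean_rewards[action])
        Phi = np.vstack([Phi, feats[action]])
        y = np.append(y, reward)
        total += mean_rewards.max() - mean_rewards[action]
        cumulative_regret.append(total)
    return np.array(cumulative_regret), theta_hat


if __name__ == "__main__":
    regret, theta_hat = run_poisson_ts()
    print("final cumulative regret:", regret[-1])
    print("posterior mode at the last step:", theta_hat)
```

Because the exponential link is monotone, choosing the action with the largest sampled linear predictor is equivalent to choosing the one with the largest sampled mean count. The sketch refits the posterior mode from scratch at every step for clarity; a practical implementation would likely warm-start the optimizer. The zero-inflated and negative binomial variants would replace the Poisson log-likelihood, gradient, and Hessian with those of the corresponding model.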

Paper Structure (45 sections, 13 theorems, 108 equations, 8 figures, 1 table, 4 algorithms)

Introduction
Motivating Example: the Drink Less Study
Preliminaries
Notations and Problem Setup
Count Data Models: A Review
Poisson Regression
Negative Binomial Regression
Zero-inflated Poisson Regression
Zero-inflated Negative Binomial Regression
The TS-Count Strategy
Laplace Approximation for Tractability
Performance Metrics
Regret Analysis
Simulation Studies
Setup
...and 30 more sections

Key Result

Proposition 4.1

Suppose $X_{t}$ is drawn i.i.d. from some distribution with support in the unit ball, i.e., $||X_{t}|| \leq 1$. Furthermore, let $\Sigma:=\mathbb{E}\left[\phi(A_t,X_t)\phi(A_t,X_t)^{\top}\right]$ be the second moment matrix, and $B$ be a positive constant. Then, there exist positive, universal const

Figures (8)

Figure 1: Graphical inspection of the distribution and normality check of the number of screen views from 8 p.m. to 9 p.m. after each intervention.
Figure 2: Simulation results of the compared algorithms under Settings (1) - (4). The results are calculated from $200$ replications of the experiment. The solid lines indicate the mean values, while the shaded bands represent the standard error bounds across the independent replications.
Figure 3: Simulation results of the compared algorithms under Settings (5) - (8). The results are calculated from $200$ replications of the experiment. The solid lines indicate the mean values, while the shaded bands represent the standard error bounds across the independent replications.
Figure 4: Comparison of the proposed algorithms and their MCMC counterparts in terms of computation time and regret. The results are calculated from $200$ replications of the experiment.
Figure 5: Results of the compared algorithms under the Drink Less study with $p_{\min} = 0.01$ and $p_{\max} = 0.99$. The results are calculated from $100$ replications of the experiment. The solid lines indicate the mean values, while the shaded bands represent the standard error bounds across the independent replications.
...and 3 more figures

Theorems & Definitions (29)

Proposition 4.1
Remark 4.1
Definition 4.1: Frequentist Regret
Definition 4.2: Bayesian Regret
Theorem 5.1
Theorem 5.2
Theorem 5.3
Theorem 5.4
Remark 5.1
Lemma A.1
...and 19 more
