Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study
Xueqing Liu, Nina Deliu, Tanujit Chakraborty, Lauren Bell, Bibhas Chakraborty
TL;DR
This work addresses online decision-making for just-in-time adaptive interventions (JITAIs) in mobile health (mHealth) when the proximal outcome is a count that exhibits overdispersion and zero-inflation. The authors develop TS-Count, a family of Thompson sampling algorithms that couple Poisson, negative binomial, zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) models with online learning, using Laplace approximations for tractable posterior sampling. They provide Bayesian and frequentist regret bounds and demonstrate, through simulations and a case study based on the Drink Less micro-randomized trial (MRT), that TS-Count generally improves cumulative proximal outcomes and user engagement, with Poisson-based variants often performing best in practice. The work offers a practical, scalable approach to count-based online personalization in real-world MRTs and leaves room for extensions such as pooling information across users and exploring alternative posterior approximation methods.
Abstract
Mobile health (mHealth) interventions often aim to improve distal outcomes, such as clinical conditions, by optimizing proximal outcomes through just-in-time adaptive interventions. Contextual bandits provide a suitable framework for customizing such interventions according to individual time-varying contexts. However, unique challenges, such as modeling count outcomes within bandit frameworks, have hindered the widespread application of contextual bandits to mHealth studies. The current work addresses this challenge by incorporating count data models into online decision-making approaches. Specifically, we combine four common offline count data models (Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regressions) with Thompson sampling, a popular contextual bandit algorithm. The proposed algorithms are motivated by and evaluated on a real dataset from the Drink Less trial, where they are shown to improve user engagement with the mHealth platform. They are further evaluated on simulated data, where they achieve higher cumulative proximal outcomes than existing algorithms. Theoretical results on regret bounds are also derived. The countts R package provides an implementation of our approach.
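To make the combination of a count model with Thompson sampling concrete, the following is a minimal Python sketch of the simplest case only: a Poisson regression per arm with a Gaussian prior, a Laplace approximation to the posterior, and arm selection from a single posterior draw. It is an illustration under stated assumptions, not the authors' implementation and not the countts API. The function names (`laplace_poisson`, `choose_arm`, `simulate`), the prior precision, and the simulated coefficients are all hypothetical, and the negative binomial and zero-inflated variants are not covered.

```python
import numpy as np


def laplace_poisson(X, y, prior_precision=1.0, n_iter=25):
    """Laplace approximation for Poisson regression with a log link.

    Assumes a N(0, I / prior_precision) prior on the coefficients.
    Returns the posterior mode and the inverse of the negative Hessian there.
    """
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ theta, -20, 20))  # clip to avoid overflow
        grad = X.T @ (y - mu) - prior_precision * theta
        hess = X.T @ (X * mu[:, None]) + prior_precision * np.eye(d)
        step = np.linalg.solve(hess, grad)  # Newton step on the log posterior
        theta = theta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    mu = np.exp(np.clip(X @ theta, -20, 20))
    hess = X.T @ (X * mu[:, None]) + prior_precision * np.eye(d)
    cov = np.linalg.inv(hess)
    return theta, (cov + cov.T) / 2  # symmetrize against round-off


def choose_arm(x, histories, rng, prior_precision=1.0):
    """Thompson step: draw coefficients per arm, pick the largest sampled mean."""
    scores = []
    for X_a, y_a in histories:
        mode, cov = laplace_poisson(X_a, y_a, prior_precision)
        theta_draw = rng.multivariate_normal(mode, cov)
        scores.append(x @ theta_draw)  # log of the sampled Poisson mean
    return int(np.argmax(scores))


def simulate(n_rounds=100, seed=0):
    """Toy two-arm environment with hypothetical true coefficients."""
    rng = np.random.default_rng(seed)
    true_theta = [np.array([0.2, 0.5]), np.array([0.6, -0.4])]  # assumed values
    d = 2
    X_hist = [[] for _ in true_theta]
    y_hist = [[] for _ in true_theta]
    total = 0
    for _ in range(n_rounds):
        x = np.array([1.0, rng.uniform(-1, 1)])  # intercept plus one context
        histories = [
            (np.array(X_a).reshape(-1, d), np.array(y_a, dtype=float))
            for X_a, y_a in zip(X_hist, y_hist)
        ]
        arm = choose_arm(x, histories, rng)
        reward = rng.poisson(np.exp(x @ true_theta[arm]))
        X_hist[arm].append(x)
        y_hist[arm].append(reward)
        total += int(reward)
    return total, [len(y_a) for y_a in y_hist]
```

Calling `simulate(100)` returns the cumulative count reward and the number of times each arm was chosen. Refitting every arm at every round keeps the sketch short; a practical implementation would more likely update the posterior incrementally or in batches.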
