Designing an Interpretable Interface for Contextual Bandits
Andrew Maher, Matia Gobbo, Lancelot Lachartre, Subash Prabanantham, Rowan Swiers, Puli Liyanagama
TL;DR
This work addresses the interpretability gap that contextual bandits pose for domain-expert operators by introducing a production dashboard built around a value gain metric derived from off-policy evaluation. The value gain is defined as $g(\tau) = v^{\pi} - v^{\overline{\pi}}$, with $v^{\pi} = \mathbb{E}_{\pi}[r]$ and $v^{\overline{\pi}}$ estimated via off-policy methods such as inverse propensity scoring, $v^{\overline{\pi}} = \frac{1}{n} \sum_{i=1}^{n} \frac{\overline{\pi}(a_i|x_i)}{\pi(a_i|x_i)} r_i$, where $(x_i, a_i, r_i)$ are the logged contexts, actions, and rewards. The interface provides top-level, arm-level, and context-level visualizations (including a radar chart and context-contribution bars) to support ablation-style reasoning about the value of each component. A qualitative user study with three marketing professionals suggests that technical metrics, when paired with accessible explanations, can be understood and used to guide production decisions. The study also yields practical design principles for future interpretable dashboards in bandit settings. The work argues for combining rigorous, technically grounded measures with clear presentation to empower non-experts to manage complex ML systems in production.
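The value gain formula above can be illustrated with a minimal Python sketch. It rests on several assumptions not stated in the source: the function names `ips_value` and `value_gain` are hypothetical, the logs are assumed to contain the propensity $\pi(a_i|x_i)$ of each logged action, and $v^{\pi}$ is approximated by the empirical mean of the logged rewards. It is not the authors' implementation.

```python
from typing import Sequence


def ips_value(
    target_probs: Sequence[float],
    logging_probs: Sequence[float],
    rewards: Sequence[float],
) -> float:
    """Inverse propensity scoring estimate of the comparison policy's value.

    target_probs[i]  : pi_bar(a_i | x_i), comparison policy probability of the logged action
    logging_probs[i] : pi(a_i | x_i), deployed policy probability of the logged action
    rewards[i]       : observed reward r_i
    """
    n = len(rewards)
    if n == 0 or not (len(target_probs) == len(logging_probs) == n):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p_bar, p, r in zip(target_probs, logging_probs, rewards):
        if p <= 0:
            raise ValueError("logging propensities must be strictly positive")
        total += (p_bar / p) * r  # importance-weighted reward
    return total / n


def value_gain(
    target_probs: Sequence[float],
    logging_probs: Sequence[float],
    rewards: Sequence[float],
) -> float:
    """Value gain g = v^pi - v^pi_bar.

    Assumption: v^pi is estimated by the mean logged reward, since the
    logs were collected under the deployed policy pi.
    """
    v_pi = sum(rewards) / len(rewards)
    v_pi_bar = ips_value(target_probs, logging_probs, rewards)
    return v_pi - v_pi_bar


# Toy logged data (made up for illustration only).
logging_probs = [0.5, 0.5, 0.25, 0.25]
target_probs = [0.25, 0.5, 0.5, 0.25]
rewards = [1.0, 0.0, 1.0, 1.0]

print(ips_value(target_probs, logging_probs, rewards))   # 0.875
print(value_gain(target_probs, logging_probs, rewards))  # -0.125
```

In this toy example the gain is negative, meaning the comparison policy is estimated to earn more reward than the deployed one. Plain inverse propensity scoring can have high variance when logging propensities are small, so a production estimate would likely need clipping or a different estimator.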
Abstract
Contextual bandits have become an increasingly popular solution for personalized recommender systems. Despite their growing use, the interpretability of these systems remains a significant challenge, particularly for the often non-expert operators tasked with ensuring their optimal performance. In this paper, we address this challenge by designing a new interface that explains the underlying behaviour of a bandit to domain experts. Central to the interface is a metric we term "value gain", a measure derived from off-policy evaluation that quantifies the real-world impact of sub-components within a bandit. We conduct a qualitative user study to evaluate the effectiveness of our interface. Our findings suggest that by carefully balancing technical rigour with accessible presentation, it is possible to empower non-experts to manage complex machine learning systems. We conclude by outlining guiding principles that other researchers should consider when building similar interfaces in the future.
