Causal Feature Selection Method for Contextual Multi-Armed Bandits in Recommender System
Zhenyu Zhao, Yexi Jiang
TL;DR
The paper tackles feature selection for contextual multi-armed bandits (CMABs) by focusing on heterogeneous treatment effects (HTE) rather than mere outcome correlation. It introduces two model-free filter methods, Heterogeneous Incremental Effect (HIE) and Heterogeneous Distribution Divergence (HDD), which quantify how features influence arm selection and reward distributions through bin-based analyses and bootstrap-based significance testing. The methods are designed to be computationally efficient and robust to model mis-specification, enabling rapid offline screening in large-scale systems. Empirical results on synthetic data and a real online recommender deployment show that HIE and HDD reliably identify influential HTE features, and that selecting these features improves CMAB performance online.
Abstract
Effective feature selection is essential for optimizing contextual multi-armed bandits (CMABs) in large-scale online systems, where suboptimal features can degrade rewards, interpretability, and efficiency. Traditional feature selection often prioritizes outcome correlation, neglecting the crucial role of heterogeneous treatment effects (HTE) across arms in CMAB decision-making. This paper introduces two novel, model-free filter methods, Heterogeneous Incremental Effect (HIE) and Heterogeneous Distribution Divergence (HDD), specifically designed to identify features driving HTE. HIE quantifies a feature's value based on its ability to induce changes in the optimal arm, while HDD measures its impact on reward distribution divergence across arms. These methods are computationally efficient, robust to model mis-specification, and adaptable to various feature types, making them suitable for rapid screening in dynamic environments where retraining complex models is infeasible. We validate HIE and HDD on synthetic data with known ground truth and in a large-scale commercial recommender system, demonstrating their consistent ability to identify influential HTE features and thereby enhance CMAB performance.
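To make the idea behind HIE concrete, the following is a minimal sketch of a bin-based score in the same spirit. It is not the paper's exact definition: the function name `binned_incremental_effect`, the quantile binning, the `n_bins` parameter, and the choice of baseline (the arm with the highest overall mean reward) are all assumptions made for illustration. The sketch also assumes that arms were assigned at random in the logged data, so that per-bin mean rewards can be compared across arms.

```python
import numpy as np


def binned_incremental_effect(feature, arm, reward, n_bins=4):
    """Hypothetical HIE-style score (not the paper's exact definition).

    Expected per-sample gain from playing the best arm within each feature
    bin instead of the single globally best arm. Assumes arms were assigned
    at random in the logged data, so per-bin means are comparable.
    """
    feature = np.asarray(feature, dtype=float)
    arm = np.asarray(arm)
    reward = np.asarray(reward, dtype=float)
    arms = np.unique(arm)

    # Quantile edges; np.unique collapses duplicates for low-cardinality features.
    edges = np.unique(np.quantile(feature, np.linspace(0.0, 1.0, n_bins + 1)))
    if len(edges) < 3:
        return 0.0  # a single bin cannot change the best arm
    bin_id = np.digitize(feature, edges[1:-1], right=True)

    # Index (into `arms`) of the arm with the highest overall mean reward.
    global_best = int(np.argmax([reward[arm == a].mean() for a in arms]))

    score = 0.0
    for b in np.unique(bin_id):
        in_bin = bin_id == b
        masks = [in_bin & (arm == a) for a in arms]
        if not all(m.any() for m in masks):
            continue  # skip bins where some arm was never observed
        means = np.array([reward[m].mean() for m in masks])
        # Gain of the bin-level best arm over the global best arm,
        # weighted by the share of samples falling in this bin.
        score += in_bin.mean() * (means.max() - means[global_best])
    return float(score)
```

On a toy log where the best arm flips between the two halves of a binary feature, this score is positive; when the reward depends only on the arm, it is zero. On noisy data the raw score is biased upward because of the per-bin maximum, which is one reason a resampling-based significance check, such as the bootstrap testing mentioned above, would be needed before ranking features.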
