Adaptive Mixture Importance Sampling for Automated Ads Auction Tuning
Yimeng Jia, Kaushal Paneri, Rong Huang, Kailash Singh Maurya, Pavan Mallapragada, Yifan Shi
TL;DR
This paper addresses counterfactual KPI estimation in dynamic ads auctions, where exploration is hampered by multi-modal objective landscapes and noisy measurements. It introduces Adaptive Mixture Importance Sampling (AMIS), which uses a mixture distribution as the proposal and jointly adapts the component parameters and mixing rates at every iteration to improve search diversity. The approach yields two policy variants, Multi-Variance and Multi-Modal, paired with two update rules, Score-based Greedy (GU) and Rank-based Peak Cluster (PCU). It is validated through offline simulations and online experiments, which show faster convergence and greater robustness than Gaussian Importance Sampling (GIS). The results indicate that AMIS improves decision-making with off-policy estimators in large-scale ads tuning and helps identify configurations that are more likely to be adopted in production.
Abstract
This paper introduces Adaptive Mixture Importance Sampling (AMIS) as a novel approach for optimizing key performance indicators (KPIs) in large-scale recommender systems, such as online ad auctions. Traditional importance sampling (IS) methods face challenges in dynamic environments, particularly in navigating multi-modal landscapes and avoiding entrapment in local optima during optimization. Instead of updating importance weights and mixing samples across iterations, as in canonical adaptive IS and multiple IS, our AMIS framework uses a mixture distribution as the proposal distribution and dynamically adjusts both the parameters of the mixture components and their mixing rates at each iteration, thereby enhancing search diversity and efficiency. Through extensive offline simulations, we demonstrate that AMIS significantly outperforms simple Gaussian Importance Sampling (GIS), particularly in noisy environments. Moreover, our approach is validated in real-world scenarios through online A/B experiments on a major search engine, where AMIS consistently identifies optimal tuning points that are more likely to be adopted as mainstream configurations. These findings indicate that AMIS enhances convergence in noisy environments, leading to more accurate and reliable decision-making in the context of importance-sampling off-policy estimators.
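To make the idea of a mixture proposal with per-iteration adaptation concrete, the following is a minimal one-dimensional sketch, not the paper's algorithm. Everything in it is an assumption made for illustration: the bimodal `toy_kpi` surface, the two-component Gaussian mixture, the function names, and in particular the update in `adapt`, which is a generic elite-refit rule (in the spirit of the cross-entropy method) standing in for the GU and PCU updates, whose details are not given in this summary. The self-normalized estimator in `snis_estimate` is the standard importance-sampling form, with the mixture that generated the samples as the proposal.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    # Density of the Gaussian mixture proposal at x.
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def toy_kpi(x, rng, noise=0.1):
    # Hypothetical noisy KPI surface: local peak near -2, global peak near +2.
    clean = 0.6 * math.exp(-(x + 2.0) ** 2) + 1.0 * math.exp(-(x - 2.0) ** 2)
    return clean + rng.gauss(0.0, noise)


def snis_estimate(xs, rewards, target_pdf, proposal_pdf):
    # Self-normalized importance sampling: reweight proposal samples
    # by target/proposal density ratios to estimate the target's mean reward.
    ratios = [target_pdf(x) / proposal_pdf(x) for x in xs]
    return sum(r * y for r, y in zip(ratios, rewards)) / sum(ratios)


def adapt(xs, comps, rewards, weights, means, stds,
          elite_frac=0.2, min_std=0.1, min_weight=0.05):
    # Illustrative update (an assumption, not the paper's GU/PCU rules):
    # refit each component to its own top-scoring samples and set its mixing
    # rate to its share of the elite set, floored before renormalizing so
    # that no component is starved of samples.
    n_elite = max(2, int(elite_frac * len(xs)))
    elite = sorted(range(len(xs)), key=lambda i: rewards[i], reverse=True)[:n_elite]
    new_w, new_m, new_s = [], [], []
    for k in range(len(weights)):
        pts = [xs[i] for i in elite if comps[i] == k]
        if len(pts) >= 2:
            m = sum(pts) / len(pts)
            var = sum((p - m) ** 2 for p in pts) / len(pts)
            new_m.append(m)
            new_s.append(max(math.sqrt(var), min_std))
        else:
            # Too few elite samples: keep the component's previous parameters.
            new_m.append(means[k])
            new_s.append(stds[k])
        new_w.append(max(len(pts) / len(elite), min_weight))
    total = sum(new_w)
    return [w / total for w in new_w], new_m, new_s


def run_sketch(n_iters=15, n_samples=200, seed=0):
    rng = random.Random(seed)
    # One narrow and one wide component, loosely echoing a multi-variance setup.
    weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [0.5, 2.0]
    estimate = None
    for _ in range(n_iters):
        comps = rng.choices(range(len(weights)), weights=weights, k=n_samples)
        xs = [rng.gauss(means[k], stds[k]) for k in comps]
        rewards = [toy_kpi(x, rng) for x in xs]
        proposal = (weights, means, stds)  # the mixture that generated xs
        weights, means, stds = adapt(xs, comps, rewards, weights, means, stds)
        # Counterfactual KPI of the currently dominant component, estimated
        # from samples drawn under the previous mixture.
        best = max(range(len(weights)), key=lambda k: weights[k])
        estimate = snis_estimate(
            xs, rewards,
            target_pdf=lambda x: gaussian_pdf(x, means[best], stds[best]),
            proposal_pdf=lambda x: mixture_pdf(x, *proposal),
        )
    return weights, means, stds, estimate


if __name__ == "__main__":
    w, m, s, est = run_sketch()
    print("mixing rates:", w)
    print("means:", m, "stds:", s)
    print("SNIS estimate for dominant component:", est)
```

The floor on mixing rates is one simple way to keep a component alive for exploration. Whether AMIS uses a comparable safeguard is not stated in this summary.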
