Beyond Self-Consistency: Loss-Balanced Perturbation-Based Regularization Improves Industrial-Scale Ads Ranking
Ilqar Ramazanli, Hamid Eghbalzadeh, Xiaoyi Liu, Yang Wang, Jiaxiang Fu, Kaushik Rangadurai, Sem Park, Bo Long, Xue Feng
TL;DR
The paper demonstrates the first successful application of perturbation-based regularization in industrial-scale ads ranking, introducing Loss-Balanced Small Perturbation Regularization (LSPR) as a simple, scalable alternative to Self-Consistency Regularization (SCR). Through numerical analysis and large-scale online experiments, LSPR shows stronger weight-space alignment and consistent offline and online gains (e.g., normalized entropy (NE) improvements of around $0.1$–$0.3\%$ offline and $0.1$–$0.2\%$ online) in a billion-scale ranking system. The work details data augmentation strategies, integration across multiple ranking stages, and practical deployment considerations, providing a framework for applying perturbation-based regularization in industrial recommender systems. Overall, LSPR emerges as a robust, scalable regularization technique that improves ad delivery performance while addressing scalability challenges across surfaces, locations, clients, and events.
Abstract
Perturbation-based regularization techniques address many challenges in industrial-scale large models, particularly with sparse labels, and emphasize consistency and invariance of model predictions under perturbation. A popular family of such techniques is self-consistency regularization, which makes small modifications to the input data while preserving contextual information and enforces similar predictions through auxiliary loss functions. In this work, we explore the first successful application of perturbation-based regularization algorithms in large-scale ads ranking models, and further propose a novel regularization algorithm, Loss-Balanced Small Perturbation Regularization (LSPR), that can potentially be used in any deep learning model. We demonstrate that both Self-Consistency Regularization (SCR) approaches and LSPR are scalable and can improve ads delivery systems. Through industrial-scale experiments and numerical analysis, we additionally show that the proposed LSPR performs consistently better than SCR across various groups and signal availability setups. Finally, we report a successful application of the proposed LSPR in a billion-scale industrial ranking system, which, to the best of our knowledge, is the first of its kind and is specifically designed to address various scalability challenges (e.g., various surfaces, geographical locations, clients, and so on), as discussed in this paper.
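The self-consistency idea described in the abstract (perturb the input slightly, then penalize disagreement between the two predictions through an auxiliary loss) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the logistic model, the Gaussian noise, the squared-difference consistency term, and the names `scr_loss` and `weighted_perturbed_loss` do not come from the paper. In particular, `weighted_perturbed_loss` is only one plausible reading of "loss-balanced" (a weighted supervised loss on the perturbed sample) and should not be taken as the definition of LSPR.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(p, y, eps=1e-7):
    """Mean binary cross-entropy between predictions p and labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def predict(w, x):
    """Toy logistic model standing in for a ranking model (assumption)."""
    return sigmoid(x @ w)


def perturb(x, rng, scale=0.01):
    """Small additive Gaussian noise on dense features (assumption)."""
    return x + scale * rng.standard_normal(x.shape)


def scr_loss(w, x, y, rng, weight=0.5, scale=0.01):
    """Supervised loss plus an auxiliary consistency term that pulls the
    predictions on clean and perturbed inputs together."""
    p = predict(w, x)
    p_tilde = predict(w, perturb(x, rng, scale))
    consistency = float(np.mean((p - p_tilde) ** 2))
    return bce(p, y) + weight * consistency


def weighted_perturbed_loss(w, x, y, rng, weight=0.5, scale=0.01):
    """Hypothetical variant: the perturbed sample keeps the original label
    and contributes a weighted supervised loss. Not the paper's LSPR."""
    p = predict(w, x)
    p_tilde = predict(w, perturb(x, rng, scale))
    return bce(p, y) + weight * bce(p_tilde, y)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))
y = (rng.random(8) > 0.5).astype(float)
w = rng.standard_normal(3)
print(scr_loss(w, x, y, rng), weighted_perturbed_loss(w, x, y, rng))
```

With `scale=0` the perturbation vanishes, so `scr_loss` reduces to the plain supervised loss, which is a quick sanity check on the auxiliary term.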
