TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems
Jiahao Yu, Haozhuang Liu, Yeqiu Yang, Lu Chen, Jian Wu, Yuning Jiang, Bo Zheng
TL;DR
This work addresses retransformation bias in regression for recommender systems by introducing TranSUN, a preemptive bias-learning paradigm that couples a transformed MSE objective with a jointly learned bias branch, so that predictions are debiased during training rather than corrected afterward. It generalizes the approach into Generalized TranSUN (GTS), a flexible regression framework in which a conditional point loss and a dynamic, slope-modulated linear transformation produce unbiased estimates under $-z(x;\theta_z)+Y_x\cdot\kappa(\mathbb{Q}_{y|x}[\mathrm{T}(y)]) \sim \mathcal{N}(0,\sigma^2)$. The authors prove unbiasedness under a stop_grad decomposition, show convergence advantages, and validate performance on synthetic and real-world data, including two Taobao deployment scenarios with online GMV gains. The results indicate that intrinsic debiasing via TranSUN/GTS not only mitigates retransformation bias but also improves the ranking of high-value samples and online business metrics, offering a practical end-to-end solution that reduces reliance on post-hoc calibration pipelines.
Abstract
Regression models are crucial in recommender systems. However, the retransformation bias problem has been conspicuously neglected within the community. While many works in other fields have devised effective bias correction methods, all of them are post-hoc cures applied externally to the model, and they face practical challenges when applied to real-world recommender systems. Hence, we propose a preemptive paradigm to eradicate the bias intrinsically from the models via minor model refinement. Specifically, we propose a novel method, TranSUN, which learns the bias jointly with the model and offers theoretically guaranteed unbiasedness together with empirically superior convergence. It is further generalized into a novel generic regression model family, termed Generalized TranSUN (GTS), which not only offers more theoretical insights but also serves as a generic framework for flexibly developing various bias-free models. Comprehensive experimental results demonstrate the superiority of our methods across data from various domains. Our methods have been successfully deployed in two real-world industrial recommendation scenarios, namely product and short video recommendation in the Guess What You Like business domain on the homepage of the Taobao App (a leading e-commerce platform with DAU > 300M), to serve the major online traffic.
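To make the notion of retransformation bias concrete, the following is a minimal Python sketch on a toy problem, not the paper's implementation. It assumes a log transform, a log-normal target for a single fixed input, and constant predictors fitted in closed form instead of a neural network. The multiplicative correction term `z` is a hypothetical illustration of a jointly learned bias branch; its exact form, and the names `f`, `z`, `naive_pred`, and `corrected_pred`, are assumptions made here and are not taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed): log-normal target for one fixed input x.
mu, sigma = 1.0, 1.0
y = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

# Transformed-MSE fit: the MSE-optimal constant in log space is mean(log y).
f = np.log(y).mean()

# Naive retransformation: exp(f) estimates exp(E[log y]), not E[y],
# so it systematically underestimates the mean of a right-skewed target.
naive_pred = np.exp(f)

# Hypothetical multiplicative bias branch: regress the ratio y / exp(f) with
# MSE, treating exp(f) as a constant (the role a stop-gradient would play).
# The MSE-optimal constant is the mean of that ratio.
z = (y / naive_pred).mean()
corrected_pred = naive_pred * z

# Closed-form mean of a log-normal variable, used as ground truth.
true_mean = np.exp(mu + sigma**2 / 2)

print(f"true mean      : {true_mean:.3f}")
print(f"naive exp(f)   : {naive_pred:.3f}")
print(f"corrected pred : {corrected_pred:.3f}")
```

In this toy case the corrected prediction equals the sample mean of `y` by construction, which is why it recovers the true mean up to sampling noise, whereas the naive back-transformed prediction stays well below it.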
