TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems

Jiahao Yu, Haozhuang Liu, Yeqiu Yang, Lu Chen, Jian Wu, Yuning Jiang, Bo Zheng

TL;DR

This work addresses retransformation bias in regression for recommender systems by introducing TranSUN, a preemptive bias-learning paradigm that couples a transformed MSE objective with a jointly learned bias branch to achieve unbiased predictions during training. It generalizes the approach into Generalized TranSUN (GTS), a flexible regression framework where a conditional point loss and a dynamic, slope-modulated linear transformation produce unbiased estimates under $-z(x;\theta_z)+Y_x\cdot\kappa(\mathbb{Q}_{y|x}[\mathrm{T}(y)]) \sim \mathcal{N}(0,\sigma^2)$. The authors prove unbiasedness under a stop_grad decomposition, show convergence advantages, and validate performance across synthetic and real-world data, including two Taobao deployment scenarios with online GMV gains. The results indicate that intrinsic debiasing via TranSUN/GTS not only mitigates retransformation bias but also improves high-value sample ranking and online business metrics, offering a practical end-to-end solution that reduces reliance on post-hoc calibration pipelines.

Abstract

Regression models are crucial in recommender systems. However, the retransformation bias problem has been conspicuously neglected within the community. While many works in other fields have devised effective bias correction methods, all of them are post-hoc cures applied externally to the model, and they face practical challenges when applied to real-world recommender systems. Hence, we propose a preemptive paradigm to eradicate the bias intrinsically from the models via minor model refinement. Specifically, we propose a novel method, TranSUN, which learns the bias jointly with the model and offers theoretically guaranteed unbiasedness with empirically superior convergence. It is further generalized into a novel generic regression model family, termed Generalized TranSUN (GTS), which not only offers more theoretical insights but also serves as a generic framework for flexibly developing various bias-free models. Comprehensive experimental results demonstrate the superiority of our methods across data from various domains. The methods have been successfully deployed in two real-world industrial recommendation scenarios, i.e., product and short video recommendation in the Guess What You Like business domain on the homepage of the Taobao App (a leading e-commerce platform with DAU > 300M), where they serve the major online traffic.
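
As a worked illustration of why retransformation bias arises, consider the logarithmic transformation under an assumed log-normal conditional distribution. The log-normal assumption and the symbols $\mu_x$, $\sigma_x$, and $f^{*}$ are introduced here for illustration only and are not taken from the paper.

$$
\begin{aligned}
\log y \mid x &\sim \mathcal{N}(\mu_x, \sigma_x^2) && \text{(illustrative assumption)} \\
f^{*}(x) &= \mathbb{E}[\log y \mid x] = \mu_x && \text{(minimizer of the MSE on } \log y\text{)} \\
\exp\big(f^{*}(x)\big) &= e^{\mu_x} && \text{(retransformed prediction)} \\
\mathbb{E}[y \mid x] &= e^{\mu_x + \sigma_x^2/2} && \text{(true conditional mean)} \\
\frac{\exp\big(f^{*}(x)\big)}{\mathbb{E}[y \mid x]} &= e^{-\sigma_x^2/2} < 1 && \text{whenever } \sigma_x > 0
\end{aligned}
$$

Under this assumption, a model trained with MSE on $\log y$ and mapped back with $\exp$ systematically underestimates the conditional mean, and the gap grows with the conditional variance.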

Paper Structure

This paper contains 72 sections, 31 equations, 4 figures, 16 tables.

Figures (4)

  • Figure 1: Bias and convergence of MSE, LogMSE, and our TranSUN (with a logarithmic $\mathrm{T}$) during training on the Indus dataset. PGR denotes the ratio of the prediction mean to the ground-truth mean in a batch, and the loss value is normalized by its maximum value. The MSE model has difficulty converging, while the LogMSE model is significantly biased (i.e., PGR < 1.0). In contrast, our method maintains both prediction unbiasedness (PGR near 1.0) and superior convergence.
  • Figure 2: Density and statistics of the sampled $\mathcal{P}_s(\mathrm{Y}|x)$ from the eight pre-defined $\mathcal{H}(\mathrm{Y})$.
  • Figure 3: Visualization results across experiments. (a) Training loss curves for LogSUN with different schemes on CIKM16. (b) Training curves of TRE and MRE for LogSUN with different schemes on CIKM16. (c) Signed TRE (STRE) and average slope $\bar{\kappa}$ across target bins of the Indus test set. (d) Sensitivity of LogSUN to $\epsilon$ on the CIKM16 test set.
  • Figure 4: A graphical comparison of the advantages and model assumptions among Mean Squared Error (MSE), transformed MSE (T-MSE), Normalizing Flow (NF), Conditional Transformation Models (CTM), Conditional Linear Transformation Models (CLTM), Mixture Density Model (MDM), TranSUN, and Generalized TranSUN (GTS).
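
The PGR metric defined in the Figure 1 caption can be sketched on synthetic data. The snippet below is a minimal illustration, not the authors' code: the log-normal targets, the constant (feature-free) predictors, and the names `pgr`, `mse_pred`, and `logmse_pred` are all assumptions made for this example. It compares the best constant prediction under plain MSE with the best constant prediction under MSE on $\log y$ after mapping back with $\exp$.

```python
import numpy as np


def pgr(pred, target):
    """Ratio of the prediction mean to the ground-truth mean in a batch."""
    return float(np.mean(pred) / np.mean(target))


# Synthetic, heavy-tailed targets (assumption: log y ~ N(0, sigma^2)).
rng = np.random.default_rng(0)
sigma = 1.0
y = rng.lognormal(mean=0.0, sigma=sigma, size=200_000)

# Best constant prediction under plain MSE: the sample mean of y.
mse_pred = np.full_like(y, y.mean())

# Best constant prediction under MSE on log(y) is mean(log y);
# mapping it back with exp yields the retransformed prediction.
logmse_pred = np.full_like(y, np.exp(np.log(y).mean()))

pgr_mse = pgr(mse_pred, y)
pgr_logmse = pgr(logmse_pred, y)

print(f"PGR (MSE):    {pgr_mse:.3f}")
print(f"PGR (LogMSE): {pgr_logmse:.3f}")
```

In this toy setting the MSE predictor has a PGR of 1 by construction, while the retransformed LogMSE predictor has a PGR close to $e^{-\sigma^2/2} \approx 0.61$ for $\sigma = 1$, which mirrors the PGR < 1.0 behaviour the caption reports for LogMSE. The sketch says nothing about convergence, which is the other axis compared in Figure 1.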