FairFS: Addressing Deep Feature Selection Biases for Recommender System
Xianquan Wang, Zhaocheng Du, Jieming Zhu, Qinglin Jia, Zhenhua Dong, Kai Zhang
TL;DR
This work tackles biases in deep feature selection for recommender systems by identifying three of them in gate-based and sensitivity-based methods: layer bias, baseline bias, and approximation bias. It introduces FairFS, which combines aggregated gradient-based feature importance with a smooth baseline and an aggregated approximation strategy to mitigate these biases and produce sparse feature selections. The approach yields state-of-the-art performance on three public datasets and demonstrates real-world impact through an online A/B test showing improvements in ECPM and latency. The results suggest FairFS as a practical tool for reducing unnecessary features in industrial recommender pipelines without sacrificing accuracy.
Abstract
Large-scale online marketplaces and recommender systems serve as critical technological support for e-commerce development. In industrial recommender systems, features play vital roles as they carry information for downstream models. Accurate feature importance estimation is critical because it helps identify the most useful feature subsets from thousands of feature candidates for online services. Such selection enables improved online performance while reducing computational cost. To address feature selection problems in deep learning, trainable gate-based and sensitivity-based methods have been proposed and proven effective in industrial practice. However, through the analysis of real-world cases, we identified three bias issues that cause feature importance estimation to rely on partial model layers, samples, or gradients, ultimately leading to inaccurate importance estimation. We refer to these as layer bias, baseline bias, and approximation bias. To mitigate these issues, we propose FairFS, a fair and accurate feature selection algorithm. FairFS regularizes feature importance estimated across all nonlinear transformation layers to address layer bias. It also introduces a smooth baseline feature close to the classifier decision boundary and adopts an aggregated approximation method to alleviate baseline and approximation biases. Extensive experiments demonstrate that FairFS effectively mitigates these biases and achieves state-of-the-art feature selection performance.
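
To make the ideas of a baseline near the decision boundary and an aggregated approximation more concrete, the following is a minimal sketch and not the FairFS algorithm itself. It assumes a toy logistic model with hand-written gradients, a baseline obtained by projecting the input onto the decision boundary, and a path-averaged gradient in the style of integrated gradients as a stand-in for the aggregation step. All function names (`boundary_baseline`, `aggregated_attribution`, and so on) and all numeric values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Toy logistic model: p = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x):
    """Analytic gradient of the prediction with respect to the input x."""
    p = predict(w, b, x)
    return [p * (1.0 - p) * wi for wi in w]

def boundary_baseline(w, b, x):
    """Assumed baseline: project x onto the hyperplane w . x + b = 0,
    i.e. the closest point where the toy model predicts exactly 0.5."""
    margin = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    return [xi - margin * wi / norm_sq for wi, xi in zip(w, x)]

def aggregated_attribution(w, b, x, baseline, steps=50):
    """Average the gradient over `steps` points on the straight path from
    the baseline to x (midpoint rule), then scale by (x - baseline).
    A single-gradient estimate would use only one point on this path."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        for i, g in enumerate(input_gradient(w, b, point)):
            totals[i] += g
    return [(xi - b0) * t / steps for xi, b0, t in zip(x, baseline, totals)]

# Hypothetical weights and input; the third feature has zero weight.
w, b = [2.0, -1.0, 0.0], 0.5
x = [1.0, 0.5, 3.0]
baseline = boundary_baseline(w, b, x)
scores = aggregated_attribution(w, b, x, baseline)
```

In this sketch the per-feature scores sum (up to numerical error) to the difference between the prediction at `x` and the prediction at the baseline, and the zero-weight feature receives a score of zero, which is the kind of behavior a selection procedure could use to discard uninformative features.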
