Embedding Compression in Recommender Systems: A Survey
Shiwei Li, Huifeng Guo, Xing Tang, Ruiming Tang, Lu Hou, Ruixuan Li, Rui Zhang
TL;DR
This survey addresses the memory and latency challenges posed by the massive embedding tables in deep learning recommender systems. It introduces a three-fold taxonomy (low-precision, mixed-dimension, and weight-sharing) to classify embedding compression methods. Under low-precision it covers binarization and quantization; under mixed-dimension, rule-based, NAS-driven, and pruning strategies; and under weight-sharing, hashing, vector quantization, and decomposition techniques. The work analyzes the trade-offs in accuracy, inference/training efficiency, and training memory, and offers general guidelines and future directions for practitioners and researchers. By consolidating diverse approaches and clarifying their suitability under different constraints, the paper aims to guide effective deployment of compressed embeddings in industrial-scale recommender systems and spur further methodological advances.
Abstract
To alleviate the problem of information explosion, recommender systems are widely deployed to provide personalized information filtering services. Usually, embedding tables are employed in recommender systems to transform high-dimensional sparse one-hot vectors into dense real-valued embeddings. However, the embedding tables are huge and account for most of the parameters in industrial-scale recommender systems. To reduce memory costs and improve efficiency, various approaches have been proposed to compress the embedding tables. In this survey, we provide a comprehensive review of embedding compression approaches in recommender systems. We first introduce deep learning recommendation models and the basic concept of embedding compression in recommender systems. Subsequently, we systematically organize existing approaches into three categories, namely low-precision, mixed-dimension, and weight-sharing. Lastly, we conclude the survey with some general suggestions and discuss future prospects for this field.
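
As a rough illustration of why the three categories reduce memory, the following sketch computes the size of a single embedding table under each one. All sizes (10 million features, 64 dimensions, the 10% "hot" split, 1 million hash buckets) and the names `table_bytes` and `shared_row` are assumptions chosen for this example, not figures or methods taken from the survey, and modulo hashing stands in for only the simplest form of weight sharing.

```python
# Hypothetical sizes, chosen only for illustration (not taken from the survey).
NUM_FEATURES = 10_000_000   # rows in the table, one per feature ID
DIM = 64                    # embedding dimension
BYTES_FP32 = 4
BYTES_INT8 = 1


def table_bytes(rows: int, dim: int, bytes_per_weight: int) -> int:
    """Memory footprint of a dense rows x dim embedding table."""
    return rows * dim * bytes_per_weight


# Baseline: full-precision table with one row per feature.
baseline = table_bytes(NUM_FEATURES, DIM, BYTES_FP32)

# Low-precision: same shape, each weight stored in fewer bits (here int8).
low_precision = table_bytes(NUM_FEATURES, DIM, BYTES_INT8)

# Mixed-dimension: assume 10% of features keep 64 dims, the rest get 8 dims.
hot_rows = NUM_FEATURES // 10
mixed_dimension = (
    table_bytes(hot_rows, DIM, BYTES_FP32)
    + table_bytes(NUM_FEATURES - hot_rows, 8, BYTES_FP32)
)

# Weight-sharing: hash feature IDs into a smaller table so rows are shared.
NUM_BUCKETS = 1_000_000
weight_sharing = table_bytes(NUM_BUCKETS, DIM, BYTES_FP32)


def shared_row(feature_id: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Simplest sharing rule: modulo hashing of the feature ID."""
    return feature_id % num_buckets


if __name__ == "__main__":
    for name, size in [
        ("baseline", baseline),
        ("low-precision", low_precision),
        ("mixed-dimension", mixed_dimension),
        ("weight-sharing", weight_sharing),
    ]:
        print(f"{name:>16}: {size / 1e6:,.0f} MB")
```

Under these assumed numbers the baseline table takes 2,560 MB, and each strategy shrinks it in a different way: by shrinking each weight, by shrinking most rows, or by reducing the number of rows. The sketch says nothing about the accuracy cost of each choice, which is the trade-off the survey analyzes.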
