Efficient Learning of Sparse Representations from Interactions
Vojtěch Vančura, Martin Spišák, Rodrigo Alves, Ladislav Peška
TL;DR
This work tackles the challenge of deploying compact yet expressive item representations in large-scale recommender systems by training sparse, high-dimensional embeddings. It introduces Compressed ELSA, which enforces row-wise sparsity via gradual pruning and enables efficient inference using compressed sparse column (CSC) storage and sparse matrix-vector multiplication (SpMV), while producing interpretable item segments through dominant latent factors and semantic merging. The approach matches dense accuracy at up to 10x compression, loses only 2.5% at up to 100x compression, outperforms post-hoc sparse methods, and provides segment-level explainability that can guide unified item and segment recommendations. The practical impact lies in enabling scalable, interpretable retrieval pipelines with little or no loss in retrieval quality, with code and demos available for deployment and exploration.
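To make the pruning and inference steps concrete, here is a minimal NumPy sketch of row-wise top-k pruning followed by a hand-rolled CSC matrix-vector product. It is an illustration under stated assumptions, not the authors' implementation: the function names, the geometric `sparsity_schedule`, and the toy sizes are all hypothetical, and the training step that would normally run between pruning rounds is omitted.

```python
import numpy as np

def prune_rows_topk(emb, k):
    """Keep the k largest-magnitude entries in each row and zero the rest."""
    n_items = emb.shape[0]
    # Column positions of the k largest |values| in every row.
    keep = np.argpartition(-np.abs(emb), k - 1, axis=1)[:, :k]
    pruned = np.zeros_like(emb)
    rows = np.arange(n_items)[:, None]
    pruned[rows, keep] = emb[rows, keep]
    return pruned

def sparsity_schedule(dim, k_final, n_rounds):
    """Hypothetical schedule: shrink nonzeros per row geometrically to k_final."""
    ks = np.geomspace(dim, k_final, n_rounds + 1)[1:]
    return [max(k_final, int(round(k))) for k in ks]

def to_csc(mat):
    """Convert a dense matrix to CSC arrays: values, row indices, column pointers."""
    data, indices, indptr = [], [], [0]
    for j in range(mat.shape[1]):
        rows = np.nonzero(mat[:, j])[0]
        data.extend(mat[rows, j])
        indices.extend(rows)
        indptr.append(len(indices))
    return (np.array(data, dtype=float),
            np.array(indices, dtype=np.int64),
            np.array(indptr, dtype=np.int64))

def csc_matvec(data, indices, indptr, x, n_rows):
    """Compute y = A @ x for A stored in CSC form, skipping zero entries of x."""
    y = np.zeros(n_rows)
    for j in np.nonzero(x)[0]:
        start, end = indptr[j], indptr[j + 1]
        # Row indices within one column are unique, so in-place add is safe.
        y[indices[start:end]] += data[start:end] * x[j]
    return y

# Toy run: 1000 items, 64 latent dimensions, 4 nonzeros per row at the end.
rng = np.random.default_rng(0)
item_emb = rng.normal(size=(1000, 64))
for k in sparsity_schedule(dim=64, k_final=4, n_rounds=4):
    # Assumption: a real pipeline would take gradient steps here between rounds.
    item_emb = prune_rows_topk(item_emb, k)

data, indices, indptr = to_csc(item_emb)
query = prune_rows_topk(rng.normal(size=(1, 64)), 4)[0]  # sparse query vector
scores = csc_matvec(data, indices, indptr, query, n_rows=item_emb.shape[0])
```

With items as rows and latent factors as columns, each CSC column lists exactly the items that are active in one factor, so scoring a sparse query only touches the columns where the query is nonzero. In practice a library format such as SciPy's CSC matrices would replace the hand-written loops.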
Abstract
Behavioral patterns captured in embeddings learned from interaction data are pivotal across various stages of production recommender systems. However, in the initial retrieval stage, practitioners face an inherent tradeoff between embedding expressiveness and the scalability and latency of serving components, resulting in the need for representations that are both compact and expressive. To address this challenge, we propose a training strategy for learning high-dimensional sparse embedding layers in place of conventional dense ones, balancing efficiency, representational expressiveness, and interpretability. To demonstrate our approach, we modified the production-grade collaborative filtering autoencoder ELSA, achieving up to 10x reduction in embedding size with no loss of recommendation accuracy, and up to 100x reduction with only a 2.5% loss. Moreover, the active embedding dimensions reveal an interpretable inverted-index structure that segments items in a way directly aligned with the model's latent space, thereby enabling integration of segment-level recommendation functionality (e.g., 2D homepage layouts) within the candidate retrieval model itself. Source code, additional results, and a live demo are available at https://github.com/zombak79/compressed_elsa
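The inverted-index view of the active dimensions can be illustrated with a small sketch. It assumes a sparse item embedding matrix with items as rows and latent factors as columns; the helper names and the toy matrix are hypothetical, and grouping items by their largest-magnitude factor is only one plausible reading of "dominant latent factor", not a description of the authors' procedure.

```python
import numpy as np
from collections import defaultdict

def build_inverted_index(sparse_emb):
    """Map each latent dimension to the items with a nonzero weight in it."""
    index = defaultdict(list)
    items, dims = np.nonzero(sparse_emb)
    for item, dim in zip(items, dims):
        index[int(dim)].append(int(item))
    return dict(index)

def dominant_segments(sparse_emb):
    """Group items by the dimension holding their largest-magnitude weight."""
    # Note: an all-zero row would fall into dimension 0 under this rule.
    dominant = np.argmax(np.abs(sparse_emb), axis=1)
    segments = defaultdict(list)
    for item, dim in enumerate(dominant):
        segments[int(dim)].append(item)
    return dict(segments)

# Toy embeddings: 4 items, 4 latent dimensions, 2 active dimensions per item.
toy_emb = np.array([
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.6, 0.0, -0.4],
])

inverted_index = build_inverted_index(toy_emb)
segments = dominant_segments(toy_emb)
```

Here items 0 and 1 share dimension 0 as their strongest factor and so land in the same segment, while the inverted index additionally records every weaker membership (for example, item 0 also appears under dimension 2).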
