EfficientRec an unlimited user-item scale recommendation system based on clustering and users interaction embedding profile

Vu Hong Quan, Le Hoang Ngan, Le Minh Duc, Nguyen Tran Ngoc Linh, Hoang Quynh-Le

TL;DR

EfficientRec tackles industrial-scale recommendation by decoupling complexity from user counts through interaction embedding on a graph and a triplet-based contrastive objective. A soft clustering approach identifies user groups, while a cluster-voted shortlist restricts the candidate item set, enabling scalable, low-cost inference. The architecture integrates a clustering module with an item-selection pipeline, supported by offline and online evaluations across Movielens-20M, Book-Crossing, and TV360 that show competitive accuracy and substantial speedups. This approach supports unlimited user growth, improved production deployment, and practical applicability in sparse, implicit-feedback environments.
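To make the cluster-voted shortlist concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the softmax-over-distances membership rule, the `temperature` parameter, the per-cluster vote tables, and the function names `soft_memberships` and `shortlist` are all assumptions chosen for illustration.

```python
import math

def soft_memberships(user_vec, centroids, temperature=1.0):
    """Assumed rule: softmax over negative squared distances to each centroid."""
    dists = [sum((u - c) ** 2 for u, c in zip(user_vec, centroid))
             for centroid in centroids]
    logits = [-d / temperature for d in dists]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def shortlist(user_vec, centroids, cluster_votes, k=2):
    """Rank items by membership-weighted votes and keep the top k.

    cluster_votes[c] is a hypothetical dict mapping item id -> vote count
    gathered from the users in cluster c.
    """
    weights = soft_memberships(user_vec, centroids)
    scores = {}
    for weight, votes in zip(weights, cluster_votes):
        for item, count in votes.items():
            scores[item] = scores.get(item, 0.0) + weight * count
    # Sort by descending score, breaking ties by item id.
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

centroids = [[0.0, 0.0], [10.0, 10.0]]
cluster_votes = [{"a": 5, "b": 1}, {"c": 5, "b": 2}]
near_first = shortlist([0.1, 0.0], centroids, cluster_votes, k=2)
near_second = shortlist([10.0, 10.0], centroids, cluster_votes, k=2)
```

In this sketch the per-user work depends on the number of clusters and the size of their vote tables, not on the total number of users, which is the property the summary attributes to the shortlist.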

Abstract

Recommendation systems are of great interest to technology companies today. As businesses continually add users and products, the numbers of users and items grow over time and become very large. Traditional recommendation algorithms, whose complexity depends on the number of users and items, are difficult to adapt to industrial environments. In this paper, we introduce a new method that applies graph neural networks within a contrastive learning framework to extract user preferences. We incorporate a soft clustering architecture that significantly reduces the computational cost of inference. Experiments show that the model learns user preferences at low computational cost in both the training and prediction phases, while also achieving very good accuracy. We call this architecture EfficientRec, reflecting the model's compactness and its ability to scale to unlimited users and products.
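The TL;DR describes the contrastive objective as triplet-based. The sketch below shows the standard triplet margin loss on plain Python lists. The squared Euclidean distance, the `margin` value, and the function names are assumptions for illustration; the exact loss used in the paper is not given in this summary.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (in squared distance).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # e.g. another view of the same user's interactions
far_negative = [3.0, 0.0]    # already well separated from the anchor
near_negative = [1.0, 0.0]   # as close to the anchor as the positive

satisfied = triplet_loss(anchor, positive, far_negative)
violated = triplet_loss(anchor, positive, near_negative)
```

Here `satisfied` evaluates to 0.0 because the far negative already clears the margin, while `violated` evaluates to 1.0 because the near negative sits at the same distance as the positive.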

Paper Structure (16 sections, 5 equations, 8 figures, 5 tables, 2 algorithms)

Figures (8)

  • Figure 1: Overall architecture
  • Figure 2: Interactions are treated as a directed graph
  • Figure 3: Interaction embedding model.
  • Figure 4: Split strategies to get positive and negative representation pairs for triplet contrastive training. These splits ensure the model learns consistency and distinguishing properties
  • Figure 5: Soft clustering compared to hard clustering. For a sparse dataset, soft clustering leads to a more comprehensive view
  • ...and 3 more figures