AlphaFree: Recommendation Free from Users, IDs, and GNNs

Minseo Jeon, Junwoo Jung, Daewon Gwak, Jinhong Jung

TL;DR

The main ideas are to infer preferences on-the-fly without user embeddings (user-free), replace raw IDs with language representations (LRs) from pre-trained language models (ID-free), and capture collaborative signals through augmentation with similar items and contrastive learning, without GNNs (GNN-free).

Abstract

Can we design effective recommender systems free from users, IDs, and GNNs? Recommender systems are central to personalized content delivery across domains, with top-K item recommendation being a fundamental task to retrieve the most relevant items from historical interactions. Existing methods rely on entrenched design conventions, often adopted without reconsideration, such as storing per-user embeddings (user-dependent), initializing features from raw IDs (ID-dependent), and employing graph neural networks (GNN-dependent). These dependencies incur several limitations, including high memory costs, cold-start and over-smoothing issues, and poor generalization to unseen interactions. In this work, we propose AlphaFree, a novel recommendation method free from users, IDs, and GNNs. Our main ideas are to infer preferences on-the-fly without user embeddings (user-free), replace raw IDs with language representations (LRs) from pre-trained language models (ID-free), and capture collaborative signals through augmentation with similar items and contrastive learning, without GNNs (GNN-free). Extensive experiments on various real-world datasets show that AlphaFree consistently outperforms its competitors, achieving up to around 40% improvements over non-LR-based methods and up to 5.7% improvements over LR-based methods, while significantly reducing GPU memory usage by up to 69% under high-dimensional LRs.
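The abstract describes the user-free idea only at a high level. As a rough illustration, the sketch below builds a user vector on the fly by mean-pooling the LRs of the items in a user's history and ranks unseen items by cosine similarity. The mean pooling, the cosine scoring, the toy three-dimensional vectors, and the function names `infer_user_vector` and `recommend_top_k` are assumptions made for illustration, not details taken from the paper; AlphaFree's actual encoder may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def infer_user_vector(history, item_lrs):
    """Mean-pool the LRs of interacted items (pooling choice is an assumption)."""
    if not history:
        raise ValueError("history must contain at least one item")
    dim = len(next(iter(item_lrs.values())))
    pooled = [0.0] * dim
    for item in history:
        for k, x in enumerate(item_lrs[item]):
            pooled[k] += x
    return l2_normalize([x / len(history) for x in pooled])


def recommend_top_k(history, item_lrs, k):
    """Rank unseen items by cosine similarity to the on-the-fly user vector."""
    user_vec = infer_user_vector(history, item_lrs)
    seen = set(history)
    scores = {
        item: sum(u * v for u, v in zip(user_vec, l2_normalize(lr)))
        for item, lr in item_lrs.items()
        if item not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy stand-ins for language representations; a real LR would come from
# a pre-trained language model and be much higher-dimensional.
item_lrs = {
    "tent": [0.9, 0.1, 0.0],
    "sleeping_bag": [0.8, 0.2, 0.1],
    "lipstick": [0.0, 0.1, 0.9],
    "camp_stove": [0.85, 0.15, 0.05],
}
top = recommend_top_k(["tent", "sleeping_bag"], item_lrs, k=1)
print(top)
```

Because no per-user parameters are stored in this sketch, a user who was never seen during training can be scored as soon as an interaction history is available.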

Paper Structure

This paper contains 34 sections, 8 theorems, 18 equations, 10 figures, 9 tables, 3 algorithms.

Key Result

Theorem 1

In the preprocessing phase, AlphaFree takes $O(T_{\textnormal{LM}}n+ mn + (d_{\textnormal{LR}} + \log{K_c})n^2)$ time, where $T_{\textnormal{LM}}$ is the time used by the LM. Its training phase requires $O((T_i d n + K_c m + d h)d_{\textnormal{LR}} + (n_s + |\bm{\mathcal{B}}|)hd)$ time for each epoch ...
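One plausible reading of the $(d_{\textnormal{LR}} + \log{K_c})n^2$ term is an all-pairs similarity pass over $n$ items, where each pair costs $O(d_{\textnormal{LR}})$ to compare and $O(\log K_c)$ to maintain a size-$K_c$ heap of the best candidates. The sketch below illustrates that pattern with cosine similarity on toy vectors. This is an interpretation offered for intuition, not the paper's proof; the similarity function and the name `top_kc_neighbors` are assumptions.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two vectors: O(d_LR) work per pair."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_kc_neighbors(lrs, k_c):
    """For every item, keep the k_c most similar other items."""
    n = len(lrs)
    neighbors = []
    for i in range(n):
        heap = []  # min-heap of (similarity, index), size at most k_c
        for j in range(n):
            if i == j:
                continue
            entry = (cosine(lrs[i], lrs[j]), j)
            if len(heap) < k_c:
                heapq.heappush(heap, entry)      # O(log k_c)
            elif entry > heap[0]:
                heapq.heapreplace(heap, entry)   # O(log k_c)
        # Report neighbors from most to least similar.
        neighbors.append([j for _, j in sorted(heap, reverse=True)])
    return neighbors


lrs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
neighbors = top_kc_neighbors(lrs, k_c=1)
print(neighbors)
```

The two nested loops give the $n^2$ factor, and the work inside the inner loop accounts for the $d_{\textnormal{LR}} + \log K_c$ factor under this reading.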

Figures (10)

  • Figure 1: Highlight of the motivation and effectiveness of AlphaFree. (Top) Existing recommender systems depend on user embeddings, item IDs, or GNNs, leading to high memory cost, cold-start limitations, and over-smoothing. (Bottom) AlphaFree overcomes these challenges by performing recommendations free from users, IDs, and GNNs, showing 69% less memory, 26–36% higher accuracy in cold-start cases, and up to around 50% improvement for heavy users.
  • Figure 2: Overall process of AlphaFree, consisting of preprocessing (item LRs and augmentations), training (contrastive alignment of the original and augmented views), and inference (using only the original-view encoder). See Table \ref{tab:symbols} for symbols.
  • Figure 3: Examples of behavioral–semantic similarities (min-max normalized) for query items $i$, where $K_c=10$, blue circles denote $\mathcal{C}_i$ based on $\texttt{sim}_{B}$, and the green histogram shows the distribution of $\texttt{sim}_{S}$. A 25% threshold based on $\texttt{sim}_{S}$ yields overly semantically similar items, the 75% threshold includes weakly related ones, while $\mu_i$ provides a balanced trade-off.
  • Figure 4: Performance of AlphaFree and AlphaRec in NDCG@20 across groups with different numbers of interactions. The green line indicates the relative improvement of AlphaFree over AlphaRec. AlphaFree shows consistent improvements, which grow with more interactions.
  • Figure 5: GPU memory usage of AlphaFree and AlphaRec during training and inference, showing AlphaFree consistently requires less memory, and AlphaRec runs out of memory (o.o.m.) on large datasets such as Beauty and Health.
  • ...and 5 more figures
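The Figure 2 caption mentions contrastive alignment of the original and augmented views during training. As a hedged illustration of what such an alignment objective can look like, the sketch below computes an InfoNCE-style loss in which row i of the original view and row i of the augmented view form a positive pair and all other rows act as negatives. InfoNCE is a commonly used contrastive loss and is swapped in here as an assumption; this page does not state the exact loss AlphaFree uses. The temperature value and the function name `info_nce` are also illustrative.

```python
import math


def info_nce(original, augmented, temperature=0.2):
    """Mean InfoNCE loss; row i of each view is treated as a positive pair.

    Assumes both views have the same number of non-zero vectors.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    orig = [normalize(v) for v in original]
    aug = [normalize(v) for v in augmented]
    total = 0.0
    for i, anchor in enumerate(orig):
        # Temperature-scaled cosine similarity to every augmented row.
        logits = [
            sum(a * b for a, b in zip(anchor, other)) / temperature
            for other in aug
        ]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_denom - logits[i]
    return total / len(orig)


original_view = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(original_view, [[0.9, 0.1], [0.1, 0.9]])
swapped = info_nce(original_view, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < swapped)
```

The loss is small when each original-view vector is closest to its own augmented counterpart and grows when the pairs are mismatched, which is the behavior a contrastive alignment objective is meant to reward.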

Theorems & Definitions (8)

  • Theorem 1: Time Complexities of AlphaFree
  • Theorem 2: Space Complexities of AlphaFree
  • Lemma 3: Time Complexity for Preprocessing
  • Lemma 4: Time Complexity for Training
  • Lemma 5: Time Complexity for Inference
  • Lemma 6: Space Complexity for Preprocessing
  • Lemma 7: Space Complexity for Training
  • Lemma 8: Space Complexity for Inference