EDGE-Rec: Efficient and Data-Guided Edge Diffusion For Recommender Systems Graphs

Utkarsh Priyam, Hemit Shah, Edoardo Botta

TL;DR

This work proposes Row-Column Separable Attention (RCSA), a new attention mechanism loosely based on the principles of collaborative filtering, which directly uses real-valued interaction weights as well as user and item features.

Abstract

Most recommender systems research focuses on binary encodings of historical user-item interactions to predict future interactions. User features, item features, and interaction strengths remain under-utilized in this space, or are used only indirectly, even though they have proven effective in large-scale production recommendation systems. We propose a new attention mechanism, loosely based on the principles of collaborative filtering, called Row-Column Separable Attention (RCSA), which takes advantage of real-valued interaction weights as well as user and item features directly. Building on this mechanism, we additionally propose a novel Graph Diffusion Transformer (GDiT) architecture, which is trained to iteratively denoise the weighted interaction matrix of the user-item interaction graph directly. The weighted interaction matrix is built from the bipartite structure of the user-item interaction graph and the corresponding edge weights derived from user-item rating interactions. Inspired by recent progress in text-conditioned image generation, our method directly produces user-item rating predictions on the same scale as the original ratings by conditioning the denoising process on user and item features.
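As an illustration of the weighted interaction matrix described above, the following minimal sketch builds one from rating triples and embeds it in the full adjacency matrix of the bipartite graph. The toy ratings and the names `build_interaction_matrix` and `build_full_adjacency` are hypothetical and are not taken from the paper's code; `None` stands in for a non-existent interaction.

```python
# Hypothetical toy data: (user index, item index, rating) triples.
ratings = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0), (2, 0, 2.0), (2, 1, 1.0)]
num_users, num_items = 3, 3


def build_interaction_matrix(triples, n_users, n_items):
    """Return an n_users x n_items matrix; None marks a missing interaction."""
    matrix = [[None] * n_items for _ in range(n_users)]
    for user, item, rating in triples:
        matrix[user][item] = rating
    return matrix


def build_full_adjacency(interaction):
    """Embed the interaction matrix R in the block form [[0, R], [R^T, 0]]."""
    n_users = len(interaction)
    n_items = len(interaction[0])
    size = n_users + n_items
    adjacency = [[None] * size for _ in range(size)]
    for u in range(n_users):
        for i in range(n_items):
            weight = interaction[u][i]
            if weight is not None:
                adjacency[u][n_users + i] = weight  # top-right block: R
                adjacency[n_users + i][u] = weight  # bottom-left block: R^T
    return adjacency


R = build_interaction_matrix(ratings, num_users, num_items)
A = build_full_adjacency(R)
```

Because users connect only to items, the user-user and item-item blocks of `A` stay empty, so `R` alone carries all of the edge information in a quarter of the entries.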

Paper Structure (18 sections, 5 equations, 13 figures)


Figures (13)

  • Figure 1: User-item interaction graph for a movie rating dataset. Each edge weight corresponds to the user-provided rating of the movie. Users and movies are linked via only their rating interactions, hence the graph is bipartite.
  • Figure 2: DiffRec. Starting from a one-hot vector of historical interactions, the forward process noises the user's interaction history until timestep $T$ via the transition step $q(x_t|x_{t-1})$. The model is trained to recover $x_0$ using $p_\theta(x_{t-1}|x_t)$ (Wang et al., 2023).
  • Figure 3: Full adjacency matrix for the interaction graph shown in Figure 1. Notice the empty quadrants in the top left and bottom right of the matrix, which result from the graph being bipartite.
  • Figure 4: Weighted interaction matrix for the same graph (non-existent interactions are shown as '---'). Our proposed diffusion method operates over these weighted interaction matrices.
  • Figure 5: A submatrix/subsample from the full weighted interaction matrix including additional features provided to the denoising model (latent representations for user features, $u_i$, and item features, $i_j$). We also show our custom attention mechanism's operation for the user-item pair $(u_3, i_2)$, attending over other users' ratings for the same item (column), as well as the same user's ratings of other items (row).
  • ...and 8 more figures
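
The row-and-column attention pattern described in the Figure 5 caption can be sketched as follows. This is not the paper's RCSA formulation: it is plain scaled dot-product attention restricted to one row and one column, with no learned projections, and averaging the two outputs is an assumption made here for illustration. The names `attend` and `row_column_attention` and the toy `cells` grid are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over lists of key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def row_column_attention(cells, user, item):
    """Attend from cell (user, item) over its row and its column separately.

    Assumption: the two results are combined by a simple average.
    """
    query = cells[user][item]
    row = cells[user]                     # same user, all items
    column = [r[item] for r in cells]     # all users, same item
    row_out = attend(query, row, row)
    col_out = attend(query, column, column)
    return [(a + b) / 2.0 for a, b in zip(row_out, col_out)]


# Toy 3 x 3 grid of 2-dimensional cell embeddings (made-up values).
cells = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0], [0.5, 0.0]],
]
# Zero-indexed equivalent of the pair (u_3, i_2) from the caption.
updated = row_column_attention(cells, user=2, item=1)
```

In this sketch each cell compares against the entries in its own row and column rather than against every cell of the matrix, which is the property the caption highlights.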