Do We Really Need Graph Convolution During Training? Light Post-Training Graph-ODE for Efficient Recommendation

Weizhi Zhang; Liangwei Yang; Zihe Song; Henry Peng Zou; Ke Xu; Liancheng Fang; Philip S. Yu

TL;DR

This work questions whether graph convolution is necessary during the training phase of graph-based recommender systems. It introduces LightGODE, a lightweight post-training graph convolution framework that uses a continuous graph ODE to capture higher-order structure while limiting embedding drift. By pre-training user and item embeddings without graph convolution, applying a discrete GCN with self-loop, and then evolving the representations via a continuous ODE, LightGODE improves both efficiency and accuracy over GCN-based models on real-world datasets, including Gowalla. The findings suggest that most of the benefit of graph convolution arises at test time, and that a post-training, continuous formulation can outperform traditional GCN-based training while remaining scalable to large graphs.
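The discrete post-training step described above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, the dense adjacency matrix, and the averaging form `0.5 * (A_hat @ E + E)` of the self-loop update are all choices made here for readability. The only idea taken from the summary is that embeddings are trained first and a parameter-free graph convolution with a self-loop is applied afterwards.

```python
import numpy as np

def normalized_adjacency(interactions, n_users, n_items):
    """Build D^{-1/2} A D^{-1/2} for the user-item bipartite graph (dense, for small examples)."""
    n = n_users + n_items
    A = np.zeros((n, n))
    for u, i in interactions:
        A[u, n_users + i] = 1.0          # user -> item edge
        A[n_users + i, u] = 1.0          # item -> user edge
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def post_training_conv(E0, A_hat, n_layers=2):
    """Propagate already-trained embeddings E0 with a self-loop; no trainable parameters.

    Assumption: each layer averages the neighbor message and the previous
    layer, E <- 0.5 * (A_hat @ E + E). The paper's exact weighting may differ.
    """
    E = np.asarray(E0, dtype=float).copy()
    for _ in range(n_layers):
        E = 0.5 * (A_hat @ E + E)
    return E

# Hypothetical toy data: 2 users, 2 items, 3 interactions.
interactions = [(0, 0), (0, 1), (1, 1)]
A_hat = normalized_adjacency(interactions, n_users=2, n_items=2)
E0 = np.random.default_rng(0).normal(size=(4, 3))   # stand-in for pre-trained embeddings
E_test = post_training_conv(E0, A_hat, n_layers=2)  # used only for scoring at test time
```

Because the propagation has no parameters, it can be run once after training finishes, so the training loop itself never touches the adjacency matrix.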

Abstract

The efficiency and scalability of graph convolution networks (GCNs) in training recommender systems (RecSys) have been persistent concerns, hindering their deployment in real-world applications. This paper presents a critical examination of the necessity of graph convolutions during the training phase and introduces an innovative alternative: the Light Post-Training Graph Ordinary-Differential-Equation (LightGODE). Our investigation reveals that the benefits of GCNs are more pronounced during testing than during training. Motivated by this, LightGODE utilizes a novel post-training graph convolution method that bypasses the computation-intensive message passing of GCNs and employs a non-parametric continuous graph ordinary-differential-equation (ODE) to dynamically model node representations. This approach drastically reduces training time while achieving fine-grained post-training graph convolution, which avoids distorting the original training embedding space (a problem we term the embedding discrepancy issue). We validate our model across several real-world datasets of different scales, demonstrating that LightGODE not only outperforms GCN-based models in terms of efficiency and effectiveness but also significantly mitigates the embedding discrepancy commonly associated with deeper graph convolution layers. LightGODE challenges the prevailing paradigms in RecSys training and suggests re-evaluating the role of graph convolutions, potentially guiding future developments of efficient large-scale graph-based RecSys.
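To make the idea of a non-parametric continuous graph ODE concrete, the sketch below integrates a simple linear ODE over node embeddings with fixed-step forward Euler. The specific dynamics dE/dt = A_hat E, the solver, and the names `graph_ode_euler`, `t_end`, and `n_steps` are assumptions made for illustration; the abstract does not state the paper's ODE or solver, so this should be read as one plausible instance rather than the authors' formulation.

```python
import numpy as np

def graph_ode_euler(E0, A_hat, t_end=1.0, n_steps=10):
    """Integrate dE/dt = A_hat @ E from t=0 to t=t_end with forward Euler.

    E0     : (n_nodes, dim) embeddings obtained from training without convolution
    A_hat  : (n_nodes, n_nodes) normalized adjacency, treated as fixed (no parameters)
    Assumed dynamics; the paper's ODE may include other terms.
    """
    E = np.asarray(E0, dtype=float).copy()
    h = t_end / n_steps                  # step size
    for _ in range(n_steps):
        E = E + h * (A_hat @ E)          # Euler update: E(t+h) = E(t) + h * dE/dt
    return E

# Hypothetical 3-node path graph with symmetric normalization already applied.
A_hat = np.array([
    [0.0, 1 / np.sqrt(2), 0.0],
    [1 / np.sqrt(2), 0.0, 1 / np.sqrt(2)],
    [0.0, 1 / np.sqrt(2), 0.0],
])
E0 = np.eye(3)                           # stand-in for pre-trained embeddings
E_fine = graph_ode_euler(E0, A_hat, t_end=1.0, n_steps=100)
E_coarse = graph_ode_euler(E0, A_hat, t_end=1.0, n_steps=1)
```

Under these assumed dynamics, a single Euler step of size 1 reduces to `E0 + A_hat @ E0`, which is one discrete layer with a self-loop. Smaller steps and a tunable `t_end` give finer control over how far the embeddings move from the trained space, which is the sense in which a continuous formulation is "fine-grained".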

Paper Structure (32 sections, 19 equations, 9 figures, 6 tables)

Introduction
Investigation of the Graph Convolution for Recommendation
The Role and Necessity of Graph Convolution during Training
The Alignment Force: A DFS Perspective
Trade-off in Designing Graph Convolution
Light Post-Training Graph-ODE for Efficient Recommendation
Pre-training User/Item Embedding
Discrete GCN with Self-Loop
Continuous Graph-ODE
Time Complexity Analysis
Experiments
Datasets
Overall Performance Comparison
Ablation Study
Efficiency Analysis
...and 17 more sections

Figures (9)

Figure 1: Preliminary study on the role of graph convolution for recommendation in training and testing stages. The MF model with graph convolution after training (MF-conv) achieves competitive results with the LightGCN-conv.
Figure 2: A comparison of the alignment force in GCN-based and MF-based models, viewed from the BFS and DFS perspectives, respectively.
Figure 3: Study of the trade-off of embedding discrepancy and high-order information on Beauty and Toys-and-Games.
Figure 4: The training pipeline of traditional GCN-based recommendation and of our proposed LightGODE with the post-training graph convolution (PTGC) framework, where we skip the time-consuming convolution-related operations to speed up training. In the PTGC stage, the self-loop prioritizes the shallow layers by placing more weight on preceding layer representations, thus mitigating the distribution discrepancy problem. Based on the design of the discrete non-parametric GCN, we derive LightGODE, a continuous ODE function that implements fine-grained graph convolution to achieve the optimal trade-off in the GCN design.
Figure 5: Trade-off between performance and efficiency on the Gowalla dataset. The upper-left direction indicates stronger performance and more efficient training.
...and 4 more figures

Theorems & Definitions (1)

Definition 1: Perfect Alignment
