On the Effectiveness of Random Weights in Graph Neural Networks

Thu Bui, Carola-Bibiane Schönlieb, Bruno Ribeiro, Beatrice Bevilacqua, Moshe Eliasof

TL;DR

The paper investigates whether randomly weighted message passing in Graph Neural Networks can match the performance of fully trained models. It introduces RAP-GNN, which uses diagonal random weights, sampled on-the-fly, in the GNN layers together with a pretrained, frozen embedding, and trains only a final classifier. The approach delivers substantial efficiency gains (training up to 6x faster and memory usage up to 3x lower) while maintaining competitive accuracy across node, graph, and large-scale tasks. The authors also provide theoretical and empirical evidence that diagonal random weights mitigate feature rank collapse. They frame random weights as effective random propagation operators that offer a flexible symmetry profile between permutation-equivariant and permutation-sensitive behavior. This work supports resource-efficient graph learning with broad applicability to real-world, large-scale graph datasets.

Abstract

Graph Neural Networks (GNNs) have achieved remarkable success across diverse tasks on graph-structured data, primarily through the use of learned weights in message passing layers. In this paper, we demonstrate that random weights can be surprisingly effective, achieving performance comparable to end-to-end training counterparts, across various tasks and datasets. Specifically, we show that by replacing learnable weights with random weights, GNNs can retain strong predictive power, while significantly reducing training time by up to 6$\times$ and memory usage by up to 3$\times$. Moreover, the random weights combined with our construction yield random graph propagation operators, which we show to reduce the problem of feature rank collapse in GNNs. These understandings and empirical results highlight random weights as a lightweight and efficient alternative, offering a compelling perspective on the design and training of GNN architectures.
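
To make the idea of diagonal random weights sampled on-the-fly more concrete, here is a minimal sketch of the propagation step only. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the weights are drawn from a uniform distribution on [0, 1] purely for demonstration, the propagation matrix is the standard GCN normalization in dense form, and the frozen pretrained embedding and the trained classifier are left out. The input `h0` stands in for the output of the frozen embedding.

```python
import random


def normalized_adjacency(edges, n):
    """Dense GCN-style propagation matrix D^{-1/2} (A + I) D^{-1/2}."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)]
            for i in range(n)]


def random_diagonal_layer(a_hat, h, rng):
    """One message passing step with a freshly sampled diagonal weight.

    A diagonal weight matrix only rescales each feature channel, so it is
    stored as a vector of length d instead of a d x d matrix.
    """
    n, d = len(h), len(h[0])
    # Assumption: uniform [0, 1] sampling, chosen here for illustration only.
    w = [rng.uniform(0.0, 1.0) for _ in range(d)]
    out = []
    for i in range(n):
        row = []
        for k in range(d):
            aggregated = sum(a_hat[i][j] * h[j][k] for j in range(n))
            row.append(max(0.0, aggregated * w[k]))  # ReLU nonlinearity
        out.append(row)
    return out


def rap_forward(a_hat, h0, num_layers, rng):
    """Stack layers; every call draws new weights, none of them are trained."""
    h = h0
    for _ in range(num_layers):
        h = random_diagonal_layer(a_hat, h, rng)
    return h


# Usage: a 3-node path graph with 2 features per node.
a_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
h0 = [[1.0, 2.0], [0.5, 1.0], [2.0, 0.0]]
embeddings = rap_forward(a_hat, h0, num_layers=2, rng=random.Random(0))
```

Because the weights are resampled on every forward pass, two calls with different random states give different node embeddings for the same graph. In the setup described above, only the classifier that consumes these embeddings would receive gradient updates.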

Paper Structure

This paper contains 47 sections, 9 equations, 6 figures, 14 tables, 2 algorithms.

Figures (6)

  • Figure 1: Illustration of RAP-GNN. The pretrained embedding $f^{\text{pre}}_\phi$ is frozen, while the classifier $c_\theta$ is optimized. All GNN weights $\mathbf{w}^{(l)}$, $l=1,\ldots,L$, are diagonal matrices, randomly sampled on-the-fly in each forward pass.
  • Figure 2: Training time (a), inference time (b), and accuracy (c) for node classification on PubMed with a GCN backbone. The runtime gap between RAP-GNN and End-to-End widens as the number of layers increases, while RAP-GNN achieves higher accuracy than End-to-End. The accuracy results in (c) validate \ref{theorem:thm}, showing that RAP-GNN remains stable in deep architectures, unlike Random Weights.
  • Figure 3: GPU memory usage comparison on the PubMed dataset: RAP-GNN requires only a third of the memory used by End-to-End (GCN backbone).
  • Figure 4: Comparison of the mean of $\text{Var}(\mathbf{h}^{(l)})$ (in log scale) across layers for End-to-End and RAP-GNN on the PubMed dataset, using the same GCN architecture with residual connections, $L = 64$, and $d = 256$. As the number of layers grows, the embedding variance of RAP-GNN increases at a higher rate than that of End-to-End, consistent with the theoretical result in \ref{theorem:thm}.
  • Figure 5: Weight matrices learned in an end-to-end trained GCN with residual connections are shown for layer 1 (a) and layer 64 (b), each sized $256 \times 256$. Notably, the learned matrices are not diagonal.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Definition 1: Node Embedding Rank