You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets
Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu
TL;DR
The paper asks whether untrained sparse subnetworks can match fully trained dense GNNs. It introduces Untrained GNNs Tickets (UGTs), a global-plus-gradual sparsification pipeline that discovers subnetworks inside randomly initialized GNNs without any weight updates. It formalizes masks, scores, and a sparsity schedule, and demonstrates that subnetworks can be found at sparsities as high as $s_f \approx 0.99$, achieving competitive accuracy across GCN, GIN, and GAT on datasets including eight small graphs and the large-scale OGBN-Arxiv. The results show that these untrained subnetworks mitigate over-smoothing in deep GNNs, preserve feature distinctions (as evidenced by MAD and t-SNE), and exhibit strong out-of-distribution (OOD) detection and robustness to perturbations, often outperforming Edge-Popup. The findings point to a new direction in which performant GNNs are obtained by identifying suitable untrained subnetworks within randomly weighted architectures, enabling deeper, more scalable models without weight optimization.
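To make the terms "mask", "score", and "sparsity schedule" concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a cubic ramp toward the final sparsity (a form common in gradual pruning; the summary above does not state the exact schedule) and assumes entries are ranked by score magnitude across all layers at once. The names `sparsity_at`, `global_mask`, and `apply_mask` are hypothetical, and the step that learns the scores is omitted.

```python
def sparsity_at(step, total_steps, s_init=0.0, s_final=0.99):
    """Gradual sparsity schedule (assumed cubic ramp from s_init to s_final)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return s_final + (s_init - s_final) * (1.0 - frac) ** 3


def global_mask(scores_per_layer, sparsity):
    """Binary masks keeping the top (1 - sparsity) fraction of entries.

    Entries are ranked by score magnitude over ALL layers together
    (assumption), so individual layers may end up with different sparsity.
    """
    magnitudes = sorted(
        (abs(s) for layer in scores_per_layer for s in layer), reverse=True
    )
    n_keep = max(1, round(len(magnitudes) * (1.0 - sparsity)))
    threshold = magnitudes[n_keep - 1]
    return [[1 if abs(s) >= threshold else 0 for s in layer]
            for layer in scores_per_layer]


def apply_mask(weights_per_layer, mask_per_layer):
    """Effective weights: the frozen random weights times the binary mask."""
    return [[w * m for w, m in zip(ws, ms)]
            for ws, ms in zip(weights_per_layer, mask_per_layer)]


# Toy example: two "layers" of frozen random-sign weights with made-up scores.
scores = [[0.9, 0.1, 0.5], [0.3, 0.8, 0.2, 0.7, 0.05, 0.6, 0.4]]
weights = [[1.0, -1.0, 1.0], [-1.0, 1.0, 1.0, -1.0, 1.0, -1.0, 1.0]]
mask = global_mask(scores, sparsity=0.7)
sparse_weights = apply_mask(weights, mask)
```

In this sketch the weights are never modified; only the mask changes as the target sparsity rises. Because the ranking is global, the toy example keeps one entry in the first layer and two in the second rather than a fixed fraction per layer.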
Abstract
Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks). However, the presence of such untrained subnetworks in graph neural networks (GNNs) still remains mysterious. In this paper, we carry out the first-of-its-kind exploration of discovering matching untrained GNNs. With sparsity as the core tool, we can find untrained sparse subnetworks at initialization that can match the performance of fully trained dense GNNs. Besides this already encouraging finding of comparable performance, we show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem, hence becoming a powerful tool to enable deeper GNNs without bells and whistles. We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations. We evaluate our method across widely used GNN architectures on various popular datasets, including the Open Graph Benchmark (OGB).
