HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks

Yongyi Yang; Jiaming Yang; Wei Hu; Michał Dereziński

TL;DR

This paper proposes HERTA, a High-Efficiency and Rigorous Training Algorithm for Unfolded GNNs that accelerates the whole training process, achieves a nearly-linear time worst-case training guarantee, and preserves the interpretability of Unfolded GNNs.

Abstract

As a variant of Graph Neural Networks (GNNs), Unfolded GNNs offer enhanced interpretability and flexibility over traditional designs. Nevertheless, they still suffer from scalability challenges when it comes to the training cost. Although many methods have been proposed to address the scalability issues, they mostly focus on per-iteration efficiency, without worst-case convergence guarantees. Moreover, those methods typically add components to or modify the original model, thus possibly breaking the interpretability of Unfolded GNNs. In this paper, we propose HERTA: a High-Efficiency and Rigorous Training Algorithm for Unfolded GNNs that accelerates the whole training process, achieving a nearly-linear time worst-case training guarantee. Crucially, HERTA converges to the optimum of the original model, thus preserving the interpretability of Unfolded GNNs. Additionally, as a byproduct of HERTA, we propose a new spectral sparsification method applicable to normalized and regularized graph Laplacians that ensures tighter bounds for our algorithm than existing spectral sparsifiers do. Experiments on real-world datasets verify the superiority of HERTA as well as its adaptability to various loss functions and optimizers.
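
To make the term "unfolded" concrete, here is a minimal sketch, not the HERTA algorithm itself. The abstract does not state the model, so the sketch assumes a common formulation in which node embeddings minimize a graph-regularized energy $\|Z - XW\|_F^2 + \lambda\,\mathrm{tr}(Z^\top L Z)$ with $L$ the normalized Laplacian, and each GNN layer is one gradient-descent step on that energy. The toy graph, the function names `normalized_laplacian` and `unfolded_forward`, and all numeric values are hypothetical choices for illustration.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} (assumes no isolated nodes)."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def unfolded_forward(lap, base, lam, num_layers):
    """Gradient descent on E(Z) = ||Z - base||_F^2 + lam * tr(Z^T L Z); one step per layer."""
    # The gradient 2(Z - base) + 2*lam*L Z has Lipschitz constant at most 2(1 + 2*lam),
    # because the eigenvalues of a normalized Laplacian lie in [0, 2].
    step = 1.0 / (2.0 * (1.0 + 2.0 * lam))
    z = base.copy()
    for _ in range(num_layers):
        grad = 2.0 * (z - base) + 2.0 * lam * (lap @ z)
        z = z - step * grad
    return z

# Hypothetical toy problem: a 4-node path graph with 3-dimensional features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # X: node features
weights = rng.normal(size=(3, 2))    # W: trainable parameters (fixed here)
lam = 1.0

lap = normalized_laplacian(adj)
base = features @ weights

# Many unfolded layers approach the closed-form minimizer (I + lam * L)^{-1} X W.
z_unfolded = unfolded_forward(lap, base, lam, num_layers=200)
z_closed = np.linalg.solve(np.eye(4) + lam * lap, base)

print("max abs difference:", np.abs(z_unfolded - z_closed).max())
```

Under this assumed formulation, the forward pass has an explicit optimization meaning, which is the interpretability the abstract refers to. The closed-form solve also shows why training cost is a concern: every evaluation involves a linear system in the regularized Laplacian $I + \lambda L$.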

Paper Structure (45 sections, 14 theorems, 94 equations, 6 figures, 2 algorithms)

Introduction
Related Work
Preliminaries
Problem Setting
Algorithm and Analysis
Analysis of TWIRLS Training
Key Techniques
SDD Solvers.
Fast Matrix Multiplication.
Regularized Spectral Sparsifier
Main Algorithm
Constructing the Preconditioner.
Solving the Outer Problem.
Experiments
Convergence Rate Comparison Under MSE Loss
...and 30 more sections

Key Result

Theorem 1.1

HERTA solves the $\lambda$-regularized Unfolded GNN objective (eq:bilevel-outer) with $n$ nodes, $m$ edges and $d$-dimensional node features to within accuracy $\epsilon$ in time $\tilde{O}\left( (m+nd) \left( \log \frac{1}{\epsilon}\right)^2 + d^3\right)$ as long as the number of large eigenvalues of

Figures (6)

Figure 1: Training loss of HERTA versus standard optimizers under MSE loss with $\lambda = 1$. Datasets, from left to right: ogbn-arxiv, citeseer, pubmed.
Figure 2: Training loss of HERTA versus standard optimizers under cross-entropy (CE) loss with $\lambda = 1$. Datasets, from left to right: ogbn-arxiv, citeseer, pubmed.
Figure 3: Training loss of HERTA versus standard optimizers under MSE loss with $\lambda = 20$. Datasets, from left to right: ogbn-arxiv, citeseer, pubmed.
Figure 4: Training loss of HERTA versus standard optimizers under cross-entropy (CE) loss with $\lambda = 20$. Datasets, from left to right: ogbn-arxiv, citeseer, pubmed.
Figure 5: Training loss of HERTA versus standard optimizers on Cora with $\lambda = 1$. Left: cross-entropy (CE) loss. Right: MSE loss.
...and 1 more figure

Theorems & Definitions (20)

Theorem 1.1: Informal version of Theorem 5.1
Definition 5.1: Effective Laplacian dimension
Theorem 5.1: Main result
Definition 5.2: Linear solver
Lemma 5.1: Convergence
Lemma 5.2: Regularized spectral sparsifier
Lemma 5.3: Preconditioner
Lemma 5.4: Well-conditioned Hessian
Lemma 1.1: sdd-solver
Lemma 1.2: srht_tropp
...and 10 more
