Nonparametric Teaching for Graph Property Learners
Chen Zhang, Weixin Bu, Zeyi Ren, Zhengwu Liu, Yik-Chung Wu, Ngai Wong
TL;DR
Graph property learners incur high training costs because they must learn the implicit mapping $f^*:\mathbb{G}\to\mathcal{Y}$ from graphs to properties. GraNT reframes this as nonparametric teaching, showing that parameter-space updates for a graph neural network induce a functional gradient flow in an RKHS, with a dynamic Graph Neural Tangent Kernel $K_{\theta^t}$ that converges to the structure-aware canonical kernel $K$. The paper provides theory linking structure-aware parameter updates to the nonparametric teaching paradigm, and introduces a greedy GraNT algorithm that selects graphs with large gradient impact to accelerate convergence. Empirically, GraNT achieves substantial training-time savings (reductions of $30.97\%$ to $47.30\%$ across graph- and node-level regression and classification tasks) while preserving generalization performance. This work broadens nonparametric teaching to graph-structured data and offers a practical path to faster graph-property learning in domains like chemistry and biology.
Abstract
Inferring properties of graph-structured data, e.g., the solubility of molecules, essentially involves learning the implicit mapping from graphs to their properties. This learning process is often costly for graph property learners like Graph Convolutional Networks (GCNs). To address this, we propose a paradigm called Graph Neural Teaching (GraNT) that reinterprets the learning process through a novel nonparametric teaching perspective. Nonparametric teaching offers a theoretical framework for teaching implicitly defined (i.e., nonparametric) mappings via example selection. Such an implicit mapping is realized by a dense set of graph-property pairs, with the GraNT teacher selecting a subset of them to promote faster convergence in GCN training. By analytically examining the impact of graph structure on parameter-based gradient descent during training, and recasting the evolution of GCNs, shaped by parameter updates, through functional gradient descent in nonparametric teaching, we show for the first time that teaching graph property learners (i.e., GCNs) is consistent with teaching structure-aware nonparametric learners. Building on these findings, GraNT improves the learning efficiency of the graph property learner, reducing training time by 36.62% for graph-level regression, 38.19% for graph-level classification, 30.97% for node-level regression, and 47.30% for node-level classification, all while maintaining generalization performance.
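The example-selection idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes scalar graph properties and a squared-error loss, and it uses the absolute residual between the learner's current prediction and the target as a stand-in for "gradient impact". The function name `select_teaching_batch` and its parameters are hypothetical.

```python
def select_teaching_batch(predictions, targets, k):
    """Return indices of the k examples with the largest absolute residual.

    Assumption (not from the paper): under a squared-error loss, the
    residual |f(G) - f*(G)| is used as a proxy for how strongly a graph
    would influence the next update.
    """
    # Per-example discrepancy between current prediction and target.
    residuals = [abs(p - t) for p, t in zip(predictions, targets)]
    # Rank example indices from largest to smallest residual.
    ranked = sorted(range(len(residuals)), key=lambda i: residuals[i], reverse=True)
    # Slicing handles k larger than the pool size by returning everything.
    return ranked[:k]


# Hypothetical pool of four graphs with scalar properties.
current_predictions = [0.0, 1.0, 5.0, 2.0]
true_properties = [0.0, 3.0, 0.0, 2.5]
chosen = select_teaching_batch(current_predictions, true_properties, k=2)
print(chosen)  # [2, 1]
```

In a training loop, such a selection step would be rerun periodically on the learner's current predictions so that each update is computed only on the chosen subset rather than on the full pool.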
