Transfer Learning Under High-Dimensional Graph Convolutional Regression Model for Node Classification
Jiachen Chen, Danyang Huang, Liyuan Wang, Kathryn L. Lunetta, Debarghya Mukherjee, Huimin Cheng
TL;DR
This work tackles node classification on graphs with high-dimensional covariates by introducing a theory-guided Graph Convolutional Multinomial Logistic Regression (GCR) model and a transfer-learning framework, Trans-GCR, that leverages labeled data from related sources. GCR models class probabilities via multinomial logistic regression on features aggregated over the graph across M layers, with sparse class-specific coefficients. Trans-GCR adds a two-step transfer process that first learns source parameters and then adjusts for domain shift in the target. The authors provide high-dimensional convergence guarantees for the estimator under Erdős–Rényi graph assumptions and report strong empirical performance: Trans-GCR often surpasses GCR and naive pooling, has a low computational cost, and needs only two hyperparameters. A practical transferable-source-detection component achieves high AUC in selecting useful sources. Real-data experiments on citation networks corroborate the improved target-task results and the efficiency relative to baselines such as AdaGCN. Overall, the work offers a scalable, theoretically grounded approach to graph-based transfer learning in high dimensions, relevant to domains where labeled data are scarce but related labeled sources exist.
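The summary above can be made concrete with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the aggregation operator (self-loops plus row normalization), the L1 penalty solved by proximal gradient, the use of source scores as a fixed offset in the second step, and all function and parameter names (`aggregate`, `fit_sparse_multinomial`, `trans_gcr`, `lam_src`, `lam_delta`) are hypothetical choices made here for clarity.

```python
import numpy as np


def aggregate(A, X, M):
    """Average features over graph neighborhoods M times.

    Assumption: self-loops are added and the adjacency matrix is
    row-normalized; the paper may use a different normalization.
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    S = A_hat / A_hat.sum(axis=1, keepdims=True)   # row-normalize
    Z = X.astype(float)
    for _ in range(M):
        Z = S @ Z
    return Z


def softmax(scores):
    """Row-wise softmax with the usual max-shift for stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def soft_threshold(B, t):
    """Proximal operator of the L1 norm (entrywise shrinkage)."""
    return np.sign(B) * np.maximum(np.abs(B) - t, 0.0)


def fit_sparse_multinomial(Z, y, n_classes, lam, offset=None,
                           lr=0.1, n_iter=500):
    """L1-penalized multinomial logistic regression via proximal gradient.

    `offset` holds fixed class scores that are added to Z @ B; it is
    used below to keep the source fit frozen while a correction is learned.
    """
    n, p = Z.shape
    Y = np.eye(n_classes)[y]                       # one-hot labels
    base = np.zeros((n, n_classes)) if offset is None else offset
    B = np.zeros((p, n_classes))
    for _ in range(n_iter):
        P = softmax(base + Z @ B)
        grad = Z.T @ (P - Y) / n                   # gradient of mean log-loss
        B = soft_threshold(B - lr * grad, lr * lam)
    return B


def trans_gcr(Z_src, y_src, Z_tgt, y_tgt, n_classes, lam_src, lam_delta):
    """Two-step transfer sketch: fit on the source, then correct on the target."""
    # Step 1: sparse coefficients from the (larger) labeled source data.
    B_src = fit_sparse_multinomial(Z_src, y_src, n_classes, lam_src)
    # Step 2: sparse correction for domain shift, with the source
    # scores held fixed as an offset.
    delta = fit_sparse_multinomial(Z_tgt, y_tgt, n_classes, lam_delta,
                                   offset=Z_tgt @ B_src)
    return B_src + delta
```

In this sketch, `aggregate(A, X, M)` would be applied separately to the source and target graphs, and predicted class probabilities for target nodes are `softmax(Z_tgt @ B)` with `B` returned by `trans_gcr`. The two penalty levels are the only tuning parameters here, which mirrors the two-hyperparameter description only loosely.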
Abstract
Node classification is a fundamental task, but obtaining node labels can be challenging and expensive in many real-world scenarios. Transfer learning has emerged as a promising solution that leverages knowledge from source domains to enhance learning in a target domain. Existing transfer learning methods for node classification primarily focus on integrating Graph Convolutional Networks (GCNs) with various transfer learning techniques. While these approaches have shown promising results, they often suffer from a lack of theoretical guarantees, restrictive conditions, and high sensitivity to hyperparameter choices. To overcome these limitations, we propose a Graph Convolutional Multinomial Logistic Regression (GCR) model and a transfer learning method based on the GCR model, called Trans-GCR. We provide theoretical guarantees for the estimator obtained under the GCR model in high-dimensional settings. Moreover, Trans-GCR demonstrates superior empirical performance, has a low computational cost, and requires fewer hyperparameters than existing methods.
