Structural Alignment Improves Graph Test-Time Adaptation
Hans Hao-Hsun Hsu, Shikun Liu, Han Zhao, Pan Li
TL;DR
Graph neural networks (GNNs) degrade under distribution shifts that alter neighborhood connectivity. The authors propose Test-Time Structural Alignment (TSA), a Graph Test-Time Adaptation (GTTA) method that adapts a pretrained GNN at inference through three components: (i) neighborhood alignment, which uses a $\boldsymbol{\gamma}$ matrix to correct conditional structure shifts; (ii) SNR-driven weighting, which blends self and neighbor representations at each layer according to their signal-to-noise ratio (SNR); and (iii) boundary refinement, which uses non-graph TTA signals. The method is supported by an upper bound on the GTTA error gap that decomposes into label shift, neighborhood shift, and feature shift terms. Empirically, TSA outperforms both non-graph TTA and existing GTTA baselines across diverse datasets and backbones. Because it requires no retraining on source data, TSA is lightweight and model-agnostic, and it remains practical under privacy and compute constraints.
Abstract
Graph-based learning excels at capturing interaction patterns in diverse domains such as recommendation, fraud detection, and particle physics. However, its performance often degrades under distribution shifts, especially those that alter network connectivity. Current methods for addressing these shifts typically require retraining with the source dataset, which is often infeasible due to computational or privacy limitations. We introduce Test-Time Structural Alignment (TSA), a novel algorithm for Graph Test-Time Adaptation (GTTA) that adapts a pretrained model to align graph structures during inference without the cost of retraining. Grounded in a theoretical understanding of graph data distribution shifts, TSA employs three synergistic strategies: uncertainty-aware neighborhood weighting to accommodate neighbor label distribution shifts, adaptive balancing of self-node and aggregated neighborhood representations based on their signal-to-noise ratio, and decision boundary refinement to correct residual label and feature shifts. Extensive experiments on synthetic and real-world datasets show that TSA consistently outperforms both non-graph TTA methods and state-of-the-art GTTA baselines.
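As a rough illustration of the first two strategies, the sketch below shows one way a single aggregation step could reweight neighbors by a class-pair matrix and then blend the result with the node's own representation. It is not the authors' implementation. The function name `aligned_aggregate`, the edge-list format, the use of pseudo-labels to index `gamma`, and the fixed scalar `alpha` are all assumptions made for illustration; this section does not specify how $\boldsymbol{\gamma}$ is estimated or how the signal-to-noise ratio is turned into a blending weight.

```python
import numpy as np

def aligned_aggregate(h, edges, pseudo_labels, gamma, alpha):
    """Hypothetical single aggregation step (illustrative only).

    h             : (n, d) float array of node representations
    edges         : list of (src, dst) pairs; a message flows src -> dst
    pseudo_labels : (n,) int array of predicted classes (assumed available)
    gamma         : (C, C) array; gamma[i, j] weights a class-j neighbor
                    of a class-i center node (values are placeholders)
    alpha         : scalar in [0, 1]; weight on the self representation,
                    standing in for an SNR-derived weight
    """
    n, d = h.shape
    agg = np.zeros((n, d))
    weight_sum = np.zeros(n)
    for src, dst in edges:
        w = gamma[pseudo_labels[dst], pseudo_labels[src]]
        agg[dst] += w * h[src]
        weight_sum[dst] += w
    # Weighted mean over neighbors; nodes without neighbors keep a zero vector.
    has_nbrs = weight_sum > 0
    agg[has_nbrs] /= weight_sum[has_nbrs][:, None]
    # Blend self and neighborhood representations.
    return alpha * h + (1.0 - alpha) * agg

# Toy example: node 0 receives messages from nodes 1 and 2.
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(1, 0), (2, 0)]
pseudo_labels = np.array([0, 1, 0])

uniform = np.ones((2, 2))                      # no reweighting
reweighted = np.array([[1.0, 3.0], [1.0, 1.0]])  # upweight class-1 neighbors of class-0 nodes

out_uniform = aligned_aggregate(h, edges, pseudo_labels, uniform, alpha=0.5)
out_reweighted = aligned_aggregate(h, edges, pseudo_labels, reweighted, alpha=0.5)
```

With the uniform matrix this reduces to plain mean aggregation, so comparing `out_uniform[0]` with `out_reweighted[0]` shows how the reweighting shifts node 0's representation toward its class-1 neighbor.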
