Population Graph Cross-Network Node Classification for Autism Detection Across Sample Groups
Anna Stephens, Francisco Santos, Pang-Ning Tan, Abdol-Hossein Esfahanian
TL;DR
This work tackles cross-network node classification (CNNC) for autism detection across multi-site data, where domain drift impedes transfer from labeled source graphs to unlabeled target graphs. It introduces OTGCN, a hybrid framework that combines graph convolutional networks with optimal transport (OT) to align source and target representations, aided by a nonlinear node feature transformation layer that handles low graph homophily. The method jointly optimizes a classification loss and an OT-based alignment loss, using the Sinkhorn distance and barycentric mapping to transport source embeddings into the target domain. Empirical results on the ABIDE dataset show that OTGCN outperforms state-of-the-art CNNC baselines, demonstrating robust cross-site Autism Spectrum Disorder (ASD) detection from combined imaging and non-imaging information. The approach offers a practical, scalable solution for domain-adaptive graph-based medical inference across diverse data collection environments.
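To make the alignment step concrete, the following is a minimal NumPy sketch of entropy-regularized optimal transport (Sinkhorn iterations) followed by a barycentric mapping of source embeddings into the target domain. It is an illustration under stated assumptions, not the authors' implementation: the function names `sinkhorn_plan` and `barycentric_map`, the uniform node weights, the squared Euclidean cost, the regularization value, and the random arrays standing in for GCN embeddings are all hypothetical, and the joint training with the classification loss is omitted.

```python
import numpy as np

def sinkhorn_plan(source, target, reg=0.1, n_iters=200):
    """Entropy-regularized OT plan between two embedding sets.

    Assumes uniform weights on the nodes and a squared Euclidean cost.
    """
    n, m = source.shape[0], target.shape[0]
    a = np.full(n, 1.0 / n)  # source marginal (assumed uniform)
    b = np.full(m, 1.0 / m)  # target marginal (assumed uniform)

    # Pairwise squared Euclidean cost, rescaled to [0, 1] for stability.
    diff = source[:, None, :] - target[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    cost = cost / max(cost.max(), 1e-12)

    K = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        v = b / (K.T @ u)  # rescale to match the target marginal
        u = a / (K @ v)    # rescale to match the source marginal

    plan = u[:, None] * K * v[None, :]
    return plan, cost

def barycentric_map(plan, target):
    """Map each source point to the plan-weighted average of target points."""
    return (plan @ target) / plan.sum(axis=1, keepdims=True)

# Random stand-ins for source and target node embeddings (hypothetical data).
rng = np.random.default_rng(0)
source_emb = rng.normal(size=(5, 3))
target_emb = rng.normal(size=(7, 3)) + 2.0  # shifted to mimic domain drift

plan, cost = sinkhorn_plan(source_emb, target_emb)
sinkhorn_distance = float(np.sum(plan * cost))  # computed on the rescaled cost
transported = barycentric_map(plan, target_emb)
```

Because each transported point is a convex combination of target embeddings, the mapped source nodes land inside the region occupied by the target domain. In a jointly trained model, a quantity like `sinkhorn_distance` would presumably be added to the classification loss as the alignment term.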
Abstract
Graph neural networks (GNNs) are a powerful tool for combining imaging and non-imaging medical information in node classification tasks. Cross-network node classification extends GNN techniques to account for domain drift, enabling node classification on an unlabeled target network. In this paper, we present OTGCN, a novel approach to cross-network node classification. It draws on graph convolutional networks to exploit the structure of graph data while applying optimal transport to correct for the domain drift that can occur between samples from different data collection sites. This combination provides a practical solution for scenarios in which many distinct forms of data are collected across different locations and equipment. We demonstrate the effectiveness of this approach in classifying Autism Spectrum Disorder subjects using both imaging and non-imaging data.
