Population Graph Cross-Network Node Classification for Autism Detection Across Sample Groups

Anna Stephens; Francisco Santos; Pang-Ning Tan; Abdol-Hossein Esfahanian

TL;DR

This work tackles cross-network node classification (CNNC) for autism detection across multi-site data, where domain drift impedes transfer from labeled source graphs to unlabeled targets. It introduces OTGCN, a hybrid framework that combines graph convolutional networks (GCN) with optimal transport (OT) to align source and target representations, aided by a nonlinear node feature transformation layer to handle low graph homophily. The method jointly optimizes a classification loss and an OT-based alignment loss, using the Sinkhorn distance and barycentric mapping to transport source embeddings into the target domain. Empirical results on the ABIDE dataset show that OTGCN outperforms state-of-the-art CNNC baselines, demonstrating robust cross-site Autism Spectrum Disorder (ASD) detection from combined imaging and non-imaging information. The approach offers a practical solution for domain-adaptive, graph-based medical inference across diverse data collection environments.
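
The Sinkhorn and barycentric-mapping step mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy embeddings, the squared Euclidean cost, the uniform node weights, the regularization value `reg`, and the helper names `sinkhorn_plan` and `barycentric_map` are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, reg=0.5, n_iters=200):
    """Entropy-regularized transport plan between weight vectors a and b."""
    K = np.exp(-cost / reg)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale to match target weights
        u = a / (K @ v)                  # rescale to match source weights
    return u[:, None] * K * v[None, :]   # plan[i, j]: mass moved from i to j

def barycentric_map(plan, target):
    """Map each source point to the plan-weighted average of target points."""
    return (plan @ target) / plan.sum(axis=1, keepdims=True)

# Toy embeddings (assumption): the target cloud is a shifted copy of the source.
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, size=(6, 2))
target = rng.normal(loc=3.0, size=(8, 2))

# Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
cost = ((source[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

# Uniform weights on the nodes of each domain (assumption).
a = np.full(len(source), 1.0 / len(source))
b = np.full(len(target), 1.0 / len(target))

plan = sinkhorn_plan(cost, a, b)
sinkhorn_cost = float((plan * cost).sum())   # transport cost under the plan
transported = barycentric_map(plan, target)  # source points moved to the target domain
```

With uniform source weights, the mean of `transported` coincides with the target mean once the plan's column sums match `b`, which is one way to see that the mapping pulls the source embeddings into the target domain.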

Abstract

Graph neural networks (GNNs) are a powerful tool for combining imaging and non-imaging medical information for node classification tasks. Cross-network node classification extends GNN techniques to account for domain drift, allowing for node classification on an unlabeled target network. In this paper we present OTGCN, a novel approach to cross-network node classification. The approach draws on concepts from graph convolutional networks to exploit graph-structured data while applying strategies rooted in optimal transport to correct for the domain drift that can occur between samples from different data collection sites. This blended approach provides a practical solution for scenarios in which many distinct forms of data are collected across different locations and equipment. We demonstrate the effectiveness of this approach at classifying Autism Spectrum Disorder subjects using a blend of imaging and non-imaging data.

Paper Structure (17 sections, 12 equations, 4 figures, 3 tables)

Introduction
Related Works
Machine Learning Approaches to ASD Detection
Machine Learning on Multi-site fMRI Data
Cross-Network Node Classification (CNNC)
Preliminaries
Problem Statement
Graph Convolutional Network (GCN)
Optimal Transport
Methodology
Initial Node Embedding Construction & Pretraining
Domain Adaptation via Optimal Transport GCN
Experimental Evaluation
Data Preparation
Experimental Setup
...and 2 more sections

Figures (4)

Figure 1: An example illustrating the perils of failing to account for concept drift in multi-source data. Assume a binary node classification task where the two classes are represented as x's and o's, respectively. The blue points denote the dataset for the source domain while the orange points denote the dataset for the target domain. Observe that a decision boundary constructed from the blue dataset will have reduced accuracy when applied to the orange dataset due to concept drift between the source and target networks.
Figure 2: Illustration of the cost and transport plan matrices of optimal transport
Figure 3: High-level architecture of our model. Combined source and target datasets are fed to two GCN layers and two NFT layers to learn two distinct embeddings. In pretraining (left), these embeddings are concatenated and sent directly to a fully connected classifier, and the model weights are updated using only the cross-entropy loss. In OT training (right), the concatenated embeddings are routed to an OT layer before being sent to the classifier. The OT layer replaces the source embedding with a transported version and then sends the target embedding and the transported source embedding to the classifier. At this point the model weights are updated with a combination of cross-entropy and OT losses.
Figure 4: t-SNE plot of combined embeddings before and after transport in the first pass of our OT layer. Points represent subjects and the lines are potential decision boundaries based on each source dataset. The light colors represent the source dataset, before transport on the left and after transport on the right. The dark colors represent the target dataset, which is the same in both cases. The green circle marks a portion of the target group that will likely be misclassified before transport occurs. The green arrow points to that same group on the right.
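
The two-branch embedding described in the Figure 3 caption can be sketched roughly as follows. This is a hedged illustration rather than the paper's code: it assumes the standard symmetric-normalized GCN propagation rule, a single layer per branch instead of two, and a plain ReLU-of-linear form for the NFT layer, whose exact definition is not given here. The names `normalized_adjacency`, `gcn_layer`, and `nft_layer`, along with the toy graph and random weights, are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """Graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weight, 0.0)

def nft_layer(features, weight):
    """Assumed NFT form: a nonlinear projection that ignores the graph."""
    return np.maximum(features @ weight, 0.0)

# Toy population graph (assumption): 4 subjects connected in a path.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # 3 input features per subject
w_gcn = rng.normal(size=(3, 4))      # random stand-ins for learned weights
w_nft = rng.normal(size=(3, 4))

a_norm = normalized_adjacency(adj)
graph_embedding = gcn_layer(a_norm, features, w_gcn)    # uses graph structure
feature_embedding = nft_layer(features, w_nft)          # uses node features only
combined = np.concatenate([graph_embedding, feature_embedding], axis=1)
```

Keeping a graph-free branch alongside the GCN branch means the classifier still sees each subject's own features when neighbors in the population graph often carry a different label, which is the low-homophily situation the TL;DR mentions.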

Theorems & Definitions (1)

Definition 1: Cross-Network Node Classification (CNNC)
