BrainHGT: A Hierarchical Graph Transformer for Interpretable Brain Network Analysis

Jiajun Ma, Yongchao Zhang, Chao Zhang, Zhao Lv, Shengbing Pei

TL;DR

BrainHGT is a hierarchical Graph Transformer that emulates the brain's local-to-global information processing by combining a long-short range attention (LSRA) encoder with a prior-guided clustering module. The LSRA encoder balances dense local interactions against sparse long-range connections, while the Dice-prior-guided clustering steers the learned functional communities toward neuroanatomically plausible groupings. Evaluations on ABIDE and ADNI show superior classification performance and interpretable, disease-relevant community structures, including biomarkers consistent with existing neuroscience literature. The approach offers multi-scale, interpretable insights that are robust to the choice of atlas, supporting more reliable biomarker discovery and clinical translation.

Abstract

Graph Transformers show remarkable potential in brain network analysis because they can model graph structure and complex node relationships. However, most existing methods model the brain as a flat network, ignoring its modular structure, and their attention mechanisms treat all connections between brain regions equally, ignoring distance-related connection patterns. In contrast, brain information processing is hierarchical: it involves local and long-range interactions between brain regions, interactions between regions and sub-functional modules, and interactions among functional modules themselves. This hierarchical interaction mechanism enables the brain to efficiently integrate local computations and global information flow, supporting the execution of complex cognitive functions. To address this gap, we propose BrainHGT, a hierarchical Graph Transformer that simulates the brain's natural information processing from local regions to global communities. Specifically, we design a novel long-short range attention encoder that uses parallel pathways to handle dense local interactions and sparse long-range connections, thereby alleviating the over-globalizing issue. To further capture the brain's modular architecture, we design a prior-guided clustering module that uses a cross-attention mechanism to group brain regions into functional communities and leverages neuroanatomical priors to guide the clustering process, thereby improving biological plausibility and interpretability. Experimental results indicate that the proposed method significantly improves disease identification performance and reliably captures the sub-functional modules of the brain, demonstrating its interpretability.
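
To make the idea of parallel short-range and long-range attention pathways concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "range" is measured in graph hops on a binarized brain network and that each pathway is a softmax restricted by a hop mask. The function names, the `max_hop` parameter, and the plain-list data layout are hypothetical, and the sparsification of the long-range pathway mentioned in the abstract is omitted.

```python
import math
from collections import deque


def hop_distances(adj):
    """All-pairs hop counts via BFS on an unweighted 0/1 adjacency matrix."""
    n = len(adj)
    dist = [[math.inf] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[src][v] == math.inf:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def masked_softmax(scores, mask):
    """Row-wise softmax over positions where mask is True.

    Rows with no allowed position are returned as all zeros.
    """
    out = []
    for row, mask_row in zip(scores, mask):
        allowed = [s for s, m in zip(row, mask_row) if m]
        if not allowed:
            out.append([0.0] * len(row))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask_row)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def long_short_attention(scores, adj, max_hop=1):
    """Split raw attention scores into two pathways by hop distance.

    Assumption (hypothetical): nodes within `max_hop` hops, including the
    node itself, form the short-range pathway; all other nodes, including
    unreachable ones, form the long-range pathway.
    """
    dist = hop_distances(adj)
    n = len(adj)
    short_mask = [[dist[i][j] <= max_hop for j in range(n)] for i in range(n)]
    long_mask = [[dist[i][j] > max_hop for j in range(n)] for i in range(n)]
    return masked_softmax(scores, short_mask), masked_softmax(scores, long_mask)
```

On a four-node path graph with uniform scores and `max_hop=1`, the first node spreads its short-range attention evenly over itself and its single neighbour, and its long-range attention evenly over the two remaining nodes. How the two pathways are fused afterwards is a separate design choice that this sketch leaves open.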

Paper Structure

This paper contains 33 sections, 11 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: The hierarchical model of brain information processing, showing three levels of interaction.
  • Figure 2: The overall framework of the proposed BrainHGT method. The model first learns multi-scale features of brain regions via a long-short range graph Transformer, and then aggregates these features into biologically plausible functional communities using a prior-guided clustering module for the final classification task.
  • Figure 3: Visualization of attention scores from the long-short range attention module.
  • Figure 4: (a) Comparison of our learned functional communities with the standard Yeo 7-network parcellation. (b) Visualization of the soft cluster assignment matrix and the Dice prior matrix. (c) Differential community interactions highlighting hyper-connectivity (left) and hypo-connectivity (right) in ASD.
  • Figure 5: Performance for different hop values on the two datasets.
  • ...and 2 more figures
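
Figure 4(b) sets a soft cluster assignment matrix beside a Dice prior matrix. As a rough illustration of how a Dice overlap between the two could be measured, the sketch below computes a soft Dice score per community. This is an assumption about the general technique, not the paper's formulation: the function name, the N x K list-of-lists layout, and the use of a 0/1 atlas membership matrix as the prior are all hypothetical.

```python
def soft_dice_per_community(assign, prior):
    """Soft Dice overlap between learned and prior communities.

    assign: N x K rows of soft assignment weights (one row per region).
    prior:  N x K rows of 0/1 membership taken from a reference atlas
            (hypothetical input format).
    Returns a list of K scores in [0, 1]; 1.0 means perfect agreement.
    """
    num_communities = len(assign[0])
    scores = []
    for c in range(num_communities):
        # Overlap: assignment weight that falls on regions the prior
        # places in community c.
        overlap = sum(a[c] * p[c] for a, p in zip(assign, prior))
        # Total mass of the learned and prior communities.
        size = sum(a[c] for a in assign) + sum(p[c] for p in prior)
        scores.append(2.0 * overlap / size if size > 0 else 0.0)
    return scores
```

A training objective could, for example, penalize one minus the mean of these scores so that learned communities drift toward the prior; whether BrainHGT uses the Dice prior in this way is not stated in the text above.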