Local Temporal Feature Enhanced Transformer with ROI-rank Based Masking for Diagnosis of ADHD

Byunggun Kim; Younghun Kwon

TL;DR

This study tackles ADHD diagnosis from resting-state fMRI by learning full spatiotemporal biomarkers with an encoder–decoder transformer. It introduces three targeted innovations: a CNN-based embedding block for spatial features, local temporal attention to capture short-range BOLD dynamics, and ROI-rank masking to focus on the most ADHD-relevant regions, all within a spatiotemporal co-attention framework. Evaluated on ADHD-200 data across sites, the approach achieves about 77.8% ACC and 79.3% AUC, outperforming several CNN- and transformer-based baselines and demonstrating robustness to ROI templates. The work provides both improved diagnostic performance and interpretable biomarker patterns, advancing cross-site ADHD diagnosis and biomarker discovery from rs-fMRI.

Abstract

In modern society, Attention-Deficit/Hyperactivity Disorder (ADHD) is a common mental disorder found not only in children but also in adults. In this context, we propose an ADHD diagnosis transformer model that simultaneously identifies important spatial and temporal brain biomarkers from resting-state functional magnetic resonance imaging (rs-fMRI). The model learns not only the individual spatial and temporal features but also the correlation between them, using full attention structures specialized for ADHD diagnosis. In particular, it focuses on learning local blood-oxygenation-level-dependent (BOLD) signals and distinguishing important regions of interest (ROIs) in the brain. We propose three methods for the ADHD diagnosis transformer. First, we design a CNN-based embedding block to obtain more expressive embedding features for brain region attention; it is adapted for the transformer from previous CNN-based ADHD diagnosis models. Next, for attention over the individual spatial and temporal features, we change the attention method to local temporal attention and ROI-rank based masking. For the temporal features of fMRI, local temporal attention learns local BOLD signal features using only simple window masking. For the spatial features of fMRI, ROI-rank based masking distinguishes highly correlated ROIs in the ROI relationships based on attention scores, thereby providing more specific biomarkers for ADHD diagnosis. Experiments were conducted with various types of transformer models. To evaluate these models, we collected data from 939 individuals across all sites provided by the ADHD-200 competition. On this data, the spatiotemporal enhanced transformer for ADHD diagnosis outperforms the other transformer variants (ACC 77.78%, SPE 76.60%, SEN 79.22%, AUC 79.30%).
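To make the idea of local temporal attention with simple window masking concrete, the following is a minimal sketch and not the authors' implementation. It assumes a symmetric window in which time step i may attend only to steps j with |i - j| <= window, and it uses plain scaled dot-product attention. The function names (`local_window_mask`, `masked_softmax`, `local_temporal_attention`) and the `window` parameter are hypothetical names introduced for illustration; the paper's actual window shape, size, and attention layout may differ.

```python
import math


def local_window_mask(num_steps, window):
    """Boolean mask: entry [i][j] is True when |i - j| <= window (assumed symmetric window)."""
    return [[abs(i - j) <= window for j in range(num_steps)] for i in range(num_steps)]


def masked_softmax(scores, mask_row):
    """Softmax over one row of scores; positions where mask_row is False get weight 0."""
    kept = [s for s, keep in zip(scores, mask_row) if keep]
    peak = max(kept)  # subtract the max kept score for numerical stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


def local_temporal_attention(queries, keys, values, window):
    """Scaled dot-product attention restricted to a local temporal window.

    queries, keys, values: one float vector per time step (lists of lists).
    Returns (outputs, weights), both indexed by time step.
    """
    dim = len(queries[0])
    mask = local_window_mask(len(queries), window)
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = masked_softmax(scores, mask[i])
        out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: 5 time steps with 2 features each (made-up numbers, not fMRI data).
signal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
outputs, weights = local_temporal_attention(signal, signal, signal, window=1)
```

With `window=1`, each time step mixes information only from itself and its immediate neighbours, so attention weights outside the window are exactly zero. Setting `window=0` reduces the layer to an identity mapping over `values`, which is a convenient sanity check.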
