CATD: Unified Representation Learning for EEG-to-fMRI Cross-Modal Generation

Weiheng Yao; Zhihan Lyu; Mufti Mahmud; Ning Zhong; Baiying Lei; Shuqiang Wang

TL;DR

The paper addresses limited access to costly fMRI by proposing CATD, a diffusion-based framework that synthesizes BOLD signals from EEG using a Conditionally Aligned Block (CAB) and Dynamic Time-Frequency Segmentation (DTFS). CATD aligns heterogeneous EEG and BOLD data in a unified latent space and uses EEG-conditioned diffusion with transformer cross-attention to perform cross-modal synthesis and temporal super-resolution of BOLD signals. Experiments on motor imagery, resting-state, and Parkinson's disease datasets show improved spatial and temporal fidelity, with ablations indicating that the CAB contributes to these gains, and the synthesized signals also support medical decision-making. The work points to non-invasive, cost-effective functional neuroimaging and lays groundwork for broader clinical applications and future real-time extensions.

Abstract

Multi-modal neuroimaging analysis is crucial for a comprehensive understanding of brain function and pathology, as it integrates different imaging techniques and thereby overcomes the limitations of individual modalities. However, the high cost and limited availability of certain modalities pose significant challenges. To address these issues, this paper proposes the Condition-Aligned Temporal Diffusion (CATD) framework for end-to-end cross-modal synthesis of neuroimaging, enabling the generation of functional magnetic resonance imaging (fMRI)-detected Blood Oxygen Level Dependent (BOLD) signals from more accessible Electroencephalography (EEG) signals. A Conditionally Aligned Block (CAB) aligns heterogeneous neuroimages in a latent space, yielding a unified representation that provides the foundation for cross-modal transformation in neuroimaging. Combined with the Dynamic Time-Frequency Segmentation (DTFS) module, the framework also uses EEG signals to improve the temporal resolution of BOLD signals, capturing more of the brain's dynamic detail. Experimental validation demonstrates that the framework improves the accuracy of brain activity state prediction by 9.13% (reaching 69.8%), enhances the diagnostic accuracy of brain disorders by 4.10% (reaching 99.55%), effectively identifies abnormal brain regions, and enhances the temporal resolution of BOLD signals. The proposed framework establishes a new paradigm for cross-modal synthesis of neuroimaging by unifying heterogeneous neuroimaging data into a latent representation space, showing promise in medical applications such as improving Parkinson's disease prediction and identifying abnormal brain regions.
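The EEG-conditioned generation summarized above relies on transformer cross-attention, in which tokens of the BOLD latent attend to tokens derived from the EEG condition. The following is a minimal, illustrative sketch of single-head cross-attention in plain Python, not the authors' implementation. The token counts, the model width, the random projection matrices, and all function names are assumptions made for the example.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cross_attention(bold_tokens, eeg_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: BOLD latents query the EEG condition.

    bold_tokens: (n_bold x d) noisy BOLD latent tokens (queries)
    eeg_tokens:  (n_eeg x d) EEG condition tokens (keys and values)
    w_q, w_k, w_v: (d x d) projection matrices (hypothetical, random here)
    Returns the attended output and the attention weights.
    """
    q = matmul(bold_tokens, w_q)
    k = matmul(eeg_tokens, w_k)
    v = matmul(eeg_tokens, w_v)
    d = len(q[0])
    # Scaled dot-product scores: one row per BOLD token, one column per EEG token.
    scores = [[sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d)
               for k_row in k] for q_row in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Gaussian random matrix used as a stand-in for learned values."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, n_bold, n_eeg = 4, 3, 5  # illustrative sizes only (assumed)
bold = random_matrix(n_bold, d_model, rng)
eeg = random_matrix(n_eeg, d_model, rng)
w_q, w_k, w_v = (random_matrix(d_model, d_model, rng) for _ in range(3))

out, attn = cross_attention(bold, eeg, w_q, w_k, w_v)
```

Each row of `attn` sums to one and indicates how strongly a given BOLD token draws on each EEG token. `out` has one row per BOLD token, so it can be added back to the BOLD latent inside a denoising block.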

Paper Structure (19 sections, 4 equations, 9 figures, 1 table, 2 algorithms)

Introduction
Method
Overview
Data Alignment Method
EEG to BOLD Diffusion
Basic ideas
Architectures
Loss Function
Experiment
Experiment Settings
Dataset
Implementation Detail
Metrics
Evaluation of the generated BOLD signal
Evaluation of temporal resolution enhanced BOLD signals
...and 4 more sections

Figures (9)

Figure 1: Comparison of Advantages and Disadvantages of BOLD fMRI and EEG
Figure 2: The overall framework of the proposed CATD. The upper part of the figure shows how two different dimensions of data are processed to achieve the initial alignment. The lower half shows the generation pipeline of the BOLD signal under the control of the EEG condition based on the DiT structure.
Figure 3: (a) Radar plot comparing real BOLD signals, generated BOLD signals, their combination, and ablation results of conditioned blocks for motor imagery and resting state classification. (b) Radar plot of the prediction results of motor imagery and resting state for the real BOLD signal, the data generated by our method, and the data generated by the compared AE-based method. (c) Radar plot of BOLD signals synthesized from different EEG frequency bands in motor imagery and resting states using the proposed CATD framework. (d) Radar plot of BOLD signals synthesized from real EEG signals, and their combination, for predicting Parkinson’s disease in a clinical decision support experiment.
Figure 4: Results of quantitative spatial and temporal metrics for synthetic BOLD signals in three states of motor imagery from ablation experiments. The upper half shows spatial metrics, and the lower half shows temporal metrics, with light green indicating use of CAB and teal indicating no CAB.
Figure 5: (a) t-SNE plots of synthesized versus real BOLD signal distributions in the CATD framework across two motor imagery states. (b) t-SNE plot of generated versus real BOLD signal distributions after ablation of the CAB. (c) t-SNE plots of BOLD signal distributions generated by the comparison method versus real signals.
...and 4 more figures
