CATD: Unified Representation Learning for EEG-to-fMRI Cross-Modal Generation
Weiheng Yao, Zhihan Lyu, Mufti Mahmud, Ning Zhong, Baiying Lei, Shuqiang Wang
TL;DR
The paper tackles the challenge of limited access to high-cost fMRI by proposing CATD, a diffusion-based framework that synthesizes BOLD signals from EEG using a Conditionally Aligned Block (CAB) and Dynamic Time-Frequency Segmentation (DTFS). By aligning heterogeneous EEG and BOLD data in a unified latent space and applying EEG-conditioned diffusion with transformer cross-attention, CATD achieves both cross-modal synthesis and temporal super-resolution of BOLD signals. Experiments on motor imagery, resting-state, and Parkinson's datasets show improvements in spatial-temporal fidelity, with ablations attributing part of the gain to CAB, and the enhanced signals support medical decision-making in diagnosis. The work suggests significant potential for non-invasive, cost-effective functional neuroimaging and lays groundwork for broader clinical applications and future real-time extensions.
Abstract
Multi-modal neuroimaging analysis is crucial for a comprehensive understanding of brain function and pathology, as it allows for the integration of different imaging techniques, thus overcoming the limitations of individual modalities. However, the high costs and limited availability of certain modalities pose significant challenges. To address these issues, this paper proposes the Condition-Aligned Temporal Diffusion (CATD) framework for end-to-end cross-modal synthesis of neuroimaging, enabling the generation of functional magnetic resonance imaging (fMRI)-detected Blood Oxygen Level Dependent (BOLD) signals from more accessible Electroencephalography (EEG) signals. By constructing a Conditionally Aligned Block (CAB), heterogeneous neuroimages are aligned into a latent space, achieving a unified representation that provides the foundation for cross-modal transformation in neuroimaging. Combined with the Dynamic Time-Frequency Segmentation (DTFS) module, the framework also uses EEG signals to improve the temporal resolution of BOLD signals, thereby capturing more of the brain's dynamic details. Experimental validation demonstrates that the framework improves the accuracy of brain activity state prediction by 9.13% (reaching 69.8%), enhances the diagnostic accuracy of brain disorders by 4.10% (reaching 99.55%), effectively identifies abnormal brain regions, and enhances the temporal resolution of BOLD signals. The proposed framework establishes a new paradigm for cross-modal synthesis of neuroimaging by unifying heterogeneous neuroimaging data into a latent representation space, showing promise in medical applications such as improving Parkinson's disease prediction and identifying abnormal brain regions.
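The conditioning idea summarized above, in which BOLD latents attend to EEG features inside a shared latent space, can be illustrated with a minimal single-head cross-attention sketch. This is not the authors' implementation: the function names, token counts, latent width, and random projection matrices below are all assumptions chosen for illustration, and a real model would learn the projections and embed this step inside a diffusion denoiser.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(bold_tokens, eeg_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: BOLD latents query EEG condition tokens.

    bold_tokens: (n_bold, d) noisy BOLD latent tokens (hypothetical shape)
    eeg_tokens:  (n_eeg, d)  EEG condition tokens in the same latent space
    Returns the EEG-conditioned update (n_bold, d) and the attention weights.
    """
    q = bold_tokens @ w_q                      # queries come from BOLD
    k = eeg_tokens @ w_k                       # keys come from EEG
    v = eeg_tokens @ w_v                       # values come from EEG
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each BOLD token attends over EEG
    return weights @ v, weights


# Illustrative sizes only; none of these numbers come from the paper.
rng = np.random.default_rng(0)
d_latent, n_bold, n_eeg = 16, 8, 20
bold_tokens = rng.normal(size=(n_bold, d_latent))
eeg_tokens = rng.normal(size=(n_eeg, d_latent))
# Random matrices stand in for learned projections.
w_q, w_k, w_v = (
    rng.normal(size=(d_latent, d_latent)) / np.sqrt(d_latent) for _ in range(3)
)

conditioned, weights = cross_attention(bold_tokens, eeg_tokens, w_q, w_k, w_v)
print(conditioned.shape, weights.shape)
```

Both modalities must already share the same latent width for the dot products to be defined, which is the role the abstract assigns to the alignment step. Each row of the weight matrix is a distribution over EEG tokens, so every BOLD token receives a convex combination of EEG value vectors.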
