A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders
Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang
TL;DR
This study tackles the challenge of objectively diagnosing bipolar disorder (BD) from multimodal MRI data. It introduces the bi-Pyramid Multimodal Fusion (BPM-Fusion) framework, which combines a Patch Pyramid Feature Extraction Module (P2FEM) for structural MRI (sMRI) and a Spatio-temporal Feature Aggregation Module (SFAM) for resting-state functional MRI (rs-fMRI) with a fusion classifier that outputs BD probabilities. On both the collected BD dataset and the public OpenfMRI dataset, BPM-Fusion achieves state-of-the-art balanced accuracy, and using both modalities performs notably better than using a single modality, which demonstrates the practical value of efficient multimodal integration for clinical neurodiagnostics. The work shows that end-to-end multimodal fusion is feasible and effective for BD diagnosis and sets the stage for exploring alternative fusion strategies to further improve diagnostic accuracy.
Abstract
Previous research on the diagnosis of bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging (rs-fMRI). However, the accuracy of these methods cannot meet the requirements of clinical diagnosis. Efficient multimodal fusion strategies have great potential for exploiting multimodal data and can further improve the performance of medical diagnosis models. In this work, we use both structural MRI (sMRI) and functional MRI (fMRI) data and propose a novel multimodal diagnosis model for bipolar disorder. The proposed Patch Pyramid Feature Extraction Module extracts sMRI features, and a spatio-temporal pyramid structure extracts fMRI features. A fusion module then combines the two feature sets, and a classifier outputs the diagnosis. Extensive experiments show that the proposed method raises balanced accuracy from 0.657 to 0.732 on the OpenfMRI dataset, outperforming existing methods and achieving state-of-the-art performance.
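The results above are reported as balanced accuracy. This section does not define the metric, so the sketch below assumes the standard definition: the mean of per-class recall, which for two classes is the average of sensitivity and specificity. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed standard definition)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Number of samples whose true label is c.
        total = sum(1 for t in y_true if t == c)
        # Number of those samples that were predicted as c.
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Hypothetical labels: 1 = bipolar disorder, 0 = healthy control.
# The classes are imbalanced on purpose (2 patients, 8 controls).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.4f}, plain accuracy: {plain:.4f}")
```

In this toy case the plain accuracy is 0.8 while the balanced accuracy is 0.6875, because one of the two patients is missed. That gap is why balanced accuracy is a common choice when patient and control groups differ in size.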
