Joint Self-Supervised and Supervised Contrastive Learning for Multimodal MRI Data: Towards Predicting Abnormal Neurodevelopment

Zhiyuan Li, Hailong Li, Anca L. Ralescu, Jonathan R. Dillman, Mekibib Altaye, Kim M. Cecil, Nehal A. Parikh, Lili He

TL;DR

This work presents a novel joint self-supervised and supervised contrastive learning method that learns a robust latent feature representation from multimodal MRI data. It projects heterogeneous features into a shared common space, thereby combining complementary and analogous information across modalities and among similar subjects.

Abstract

The integration of different imaging modalities, such as structural, diffusion tensor, and functional magnetic resonance imaging, with deep learning models has yielded promising outcomes in discerning phenotypic characteristics and enhancing disease diagnosis. The development of such techniques hinges on the efficient fusion of heterogeneous multimodal features, which initially reside in distinct representation spaces. Naively fusing the multimodal features does not adequately capture the complementary information and can even introduce redundancy. In this work, we present a novel joint self-supervised and supervised contrastive learning method that learns a robust latent feature representation from multimodal MRI data. The method projects heterogeneous features into a shared common space, thereby combining complementary and analogous information across modalities and among similar subjects. We compared the proposed method with alternative deep multimodal learning approaches. In extensive experiments on two independent datasets, our method significantly outperformed several other deep multimodal learning methods in predicting abnormal neurodevelopment. The method could facilitate computer-aided diagnosis in clinical practice by making use of multimodal data.
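To make the idea of a joint objective more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a standard InfoNCE-style loss for the self-supervised part (the same subject seen through two modalities forms a positive pair) and a standard supervised contrastive loss for the label-driven part (subjects with the same label are positives). The function names, the `temperature` value, and the `weight` that balances the two terms are all hypothetical choices made for illustration.

```python
import numpy as np


def _normalize(x):
    # Project each embedding onto the unit sphere so dot products are cosines.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def _log_softmax_rows(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def cross_modal_loss(z_a, z_b, temperature=0.5):
    """Self-supervised term (assumed InfoNCE): row i of z_a should match row i of z_b."""
    z_a, z_b = _normalize(z_a), _normalize(z_b)
    log_prob = _log_softmax_rows(z_a @ z_b.T / temperature)
    return -np.mean(np.diag(log_prob))


def supervised_contrastive_loss(z, labels, temperature=0.5):
    """Supervised term (assumed): subjects sharing a label are pulled together.

    Assumes at least two subjects and at least one same-label pair in the batch.
    """
    z = _normalize(z)
    labels = np.asarray(labels)
    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor's similarity with itself from the softmax.
    logits = np.where(self_mask, -np.inf, z @ z.T / temperature)
    log_prob = _log_softmax_rows(logits)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0  # anchors that have at least one positive
    summed = np.where(pos, log_prob, 0.0).sum(axis=1)
    return np.mean(-summed[valid] / n_pos[valid])


def joint_loss(embeddings, labels, weight=0.5, temperature=0.5):
    """Hypothetical weighted sum of both terms over a list of (N, d) modality embeddings."""
    pairs = [(i, j) for i in range(len(embeddings))
             for j in range(i + 1, len(embeddings))]
    cm = np.mean([cross_modal_loss(embeddings[i], embeddings[j], temperature)
                  for i, j in pairs])
    sup = np.mean([supervised_contrastive_loss(z, labels, temperature)
                   for z in embeddings])
    return weight * cm + (1.0 - weight) * sup
```

In this sketch, `embeddings` would hold one array per feature type after the feature extractor, all already mapped to the same dimension. How the paper actually weights, schedules, or combines its two contrastive tasks is not stated in this summary, so the simple weighted sum here is only an assumption.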

Paper Structure (28 sections, 9 equations, 7 figures, 8 tables, 1 algorithm)

Figures (7)

  • Figure 1: Schematic diagram of the proposed deep multimodal contrastive network for early prediction of neurological deficits at 2 years corrected age. Five feature types from $N$ subjects are first passed through a feature extractor block to obtain five feature embeddings. Two contrastive learning tasks then force the model to learn the CMC features and CSS features. Finally, the pre-trained network is fine-tuned in a supervised manner to predict the risk of cognitive deficits.
  • Figure 2: Illustration of how the proposed method learns CMC features.
  • Figure 3: Illustration of how the proposed method learns CSS features.
  • Figure 4: t-SNE visualization of the latent feature space (the network's last hidden layer) for different methods predicting cognitive deficits. (a) Feature representation in the original space before model optimization. (b) Feature representation learned by our method, taken from the last hidden layer in the downstream stage. (c-h) Feature representations learned by the competing methods.
  • Figure 5: The ROC curves of different competing methods. The AUC values are shown in the lower right of the figure.
  • ...and 2 more figures