Enhanced MRI Representation via Cross-series Masking

Churan Wang, Fei Gao, Lijun Yan, Siwen Wang, Yizhou Yu, Yizhou Wang

TL;DR

This work tackles the challenge of learning robust representations from multi-series MRI data in the absence of extensive annotations. It introduces Cross-Series Masking (CSM), a self-supervised strategy that employs intra-series masking and inter-series masking to train a ViT-based encoder via reconstruction, enabling fusion of information across series. The learned representations achieve state-of-the-art results on brain tumor segmentation and improve breast MRI and prostate cancer diagnosis, even with limited labeled data, highlighting strong potential for clinical deployment. By leveraging large unlabeled multi-series datasets, CSM reduces annotation costs while delivering performance gains across diverse downstream tasks.

Abstract

Magnetic resonance imaging (MRI) is indispensable for diagnosis and treatment planning across many medical conditions because it produces multi-series images that reveal different tissue characteristics. However, integrating these diverse series into a coherent analysis is challenging: the series differ in spatial resolution and contrast patterns, and learning from them typically requires extensive annotated data, which is scarce in clinical practice. To address these issues, we introduce a novel Cross-Series Masking (CSM) strategy for effectively learning MRI representations in a self-supervised manner. Specifically, CSM first randomly samples a subset of regions and series and masks them. During training, the cross-series representation is learned by using the unmasked data to reconstruct the masked portions. This process not only integrates information across different series but also enables the model to capture both intra-series and inter-series correlations and complementarities. The learned representation in turn improves downstream tasks such as segmentation and classification. Taking brain tissue segmentation, benign/malignant breast tumor classification, and prostate cancer diagnosis as examples, our method achieves state-of-the-art performance on both public and in-house datasets.
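The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cross_series_mask`, the 75% patch ratio, and the choice to drop exactly one series are assumptions made for illustration only, since the text here does not give the actual ratios. The sketch only shows how a per-series patch mask (intra-series) and a whole-series mask (inter-series) could be combined.

```python
import random


def cross_series_mask(num_series, num_patches, patch_ratio=0.75,
                      num_dropped_series=1, seed=None):
    """Return (mask, dropped), where mask[s][p] is True if patch p of
    series s is hidden from the encoder.

    patch_ratio and num_dropped_series are illustrative assumptions,
    not values taken from the paper.
    """
    if not 0 <= num_dropped_series < num_series:
        # At least one series must stay visible to reconstruct from.
        raise ValueError("num_dropped_series must be in [0, num_series)")
    rng = random.Random(seed)

    # Inter-series masking: hide every patch of a random subset of series.
    dropped = set(rng.sample(range(num_series), num_dropped_series))

    num_hidden = int(round(patch_ratio * num_patches))
    mask = []
    for s in range(num_series):
        if s in dropped:
            mask.append([True] * num_patches)
            continue
        # Intra-series masking: hide a random subset of patches,
        # drawn independently for each remaining series.
        hidden = set(rng.sample(range(num_patches), num_hidden))
        mask.append([p in hidden for p in range(num_patches)])
    return mask, sorted(dropped)


if __name__ == "__main__":
    # Example: three series (e.g. T1ce, DWI, T2w), 16 patches each.
    mask, dropped = cross_series_mask(3, 16, seed=0)
    for s, row in enumerate(mask):
        print(s, "fully masked" if s in dropped else f"{sum(row)} of 16 patches masked")
```

In this sketch, only patches whose mask entry is False would be passed to the encoder; the hidden patches and the dropped series become reconstruction targets.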

Paper Structure

This paper contains 13 sections, 3 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Examples of using multiple series for clinical diagnosis, taking breast malignancy diagnosis as an example. (a) The upper case is malignant (high intensity in T2w and DWI, oval shape and lobulated margin in T1ce) and the lower case is benign (low intensity in T2w, high intensity in DWI, oval shape and circumscribed margin in T1ce). Each case shows the three series together with a close-up view of the lesion and its surrounding area. Diagnostic attributes from a single series alone are not sufficient to diagnose the lesions. (b) Using multiple series is much more effective than using a single series, and our method achieves state-of-the-art performance.
  • Figure 2: Overview of our proposed method for learning MRI representations via cross-series masking. Multiple MRI series are taken as input and undergo random series and patch masking. The remaining unmasked regions are encoded by a ViT encoder, which is trained to reconstruct the masked series and patches. The trained encoder can model the correlations between multiple series and extract the cross-series representation. On this basis, performance on downstream tasks can be improved without increasing the amount of training data.
  • Figure 3: The strategy of our proposed CSM method. (a) Intra-series masking, which randomly masks a substantial proportion of the patches within each series separately. With intra-series masking, the model can leverage both inter-series and intra-series context and also learn inter-series complementarities. (b) Inter-series masking, which randomly masks a subset of the series in full. This strategy forces the model to learn the complementarity and relationships among series from a global view.
  • Figure 4: Examples from the datasets used in our study. Breast2023 is our in-house breast MRI dataset, which contains T1ce, DWI, and T2w series for each patient. The lesion is delineated by yellow lines, and its malignancy is confirmed by pathological testing. BT-MSD is a publicly available brain tumor MRI segmentation dataset with four modalities (T1w, T1ce, T2w, FLAIR); the tumor region is delineated by experts. Prostate158 is a public prostate MRI dataset. Each patient has three series (T2w, DWI, and ADC), with the cancerous area marked by a doctor (the region inside the yellow curve).
  • Figure 5: Examples of class activation maps for different series in Breast2023 and Prostate under multi-series classification.
  • ...and 1 more figures
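
The Figure 2 caption says the encoder is trained to reconstruct the masked series and patches from the unmasked ones. The text here does not state the loss function, so the sketch below assumes a mean squared error computed only over hidden patches, which is a common choice for masked-reconstruction pretraining and may differ from what the paper uses. The function name `masked_mse` and the nested-list data layout are hypothetical.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over hidden patches only.

    pred, target: nested lists indexed as [series][patch][value].
    mask: nested list indexed as [series][patch]; True means the patch
    was hidden from the encoder and must be reconstructed.

    Assumption: visible patches do not contribute to the loss.
    """
    total, count = 0.0, 0
    for pred_s, target_s, mask_s in zip(pred, target, mask):
        for p_vec, t_vec, hidden in zip(pred_s, target_s, mask_s):
            if not hidden:
                continue  # visible patch: skipped under this assumption
            total += sum((a - b) ** 2 for a, b in zip(p_vec, t_vec))
            count += len(t_vec)
    if count == 0:
        raise ValueError("mask hides no patches; nothing to reconstruct")
    return total / count


if __name__ == "__main__":
    # One series, two patches of two values each; only patch 0 is hidden.
    target = [[[1.0, 1.0], [2.0, 2.0]]]
    pred = [[[0.0, 0.0], [2.0, 5.0]]]
    print(masked_mse(pred, target, [[True, False]]))
```

Under inter-series masking, a fully hidden series simply has every patch marked True, so all of its patches enter the loss.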