
Automated Detection of Multiple Sclerosis Lesions on 7-tesla MRI Using U-net and Transformer-based Segmentation

Michael Maynord, Minghui Liu, Cornelia Fermüller, Seongjin Choi, Yuxin Zeng, Shishir Dahal, Daniel M. Harrison

Abstract

Ultra-high field 7-tesla (7T) MRI improves visualization of multiple sclerosis (MS) white matter lesions (WML), but its contrast and artifacts differ sufficiently from 1.5-3T imaging that widely used automated segmentation tools may not translate directly. We analyzed 7T FLAIR scans and generated reference WML masks from Lesion Segmentation Tool (LST) outputs followed by expert manual revision. As external comparators, we applied LST-LPA and the more recent LST-AI ensemble, both originally developed on lower-field data. We then trained 3D UNETR and SegFormer transformer-based models on 7T FLAIR at multiple resolutions (0.5×0.5×0.5, 1.0×1.0×1.0, and 1.5×1.5×2.0 mm³) and evaluated all methods using voxel-wise and lesion-wise metrics from the BraTS 2023 framework. On the held-out test set at native 0.5×0.5×0.5 mm³ resolution, 7T-trained transformers achieved competitive overlap with LST-AI while recovering additional small lesions missed by classical methods, at the cost of some boundary variability and occasional artifact-related false positives. On this test set, our best transformer model (SegFormer) achieved a voxel-wise Dice of 0.61 and a lesion-wise Dice of 0.20, improving on the classical LST-LPA tool (voxel-wise Dice 0.39, lesion-wise Dice 0.02). Performance decreased for models trained on downsampled images, underscoring the value of native 7T resolution for small-lesion detection. By releasing our 7T-trained models, we aim to provide a reproducible, ready-to-use resource for automated lesion quantification in ultra-high field MS research (https://github.com/maynord/7T-MS-lesion-segmentation).
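The abstract reports both voxel-wise and lesion-wise Dice. The paper's evaluation code is not reproduced here; the sketch below illustrates the distinction under simplifying assumptions: voxel-wise Dice is computed over all voxels, while lesion-wise Dice treats each connected component as a unit, counting a reference lesion as detected if any predicted voxel overlaps it (the actual BraTS 2023 lesion-wise metric additionally applies dilation and minimum-size rules, which are omitted here).

```python
import numpy as np
from scipy import ndimage


def voxel_dice(pred, ref):
    """Voxel-wise Dice: 2|P ∩ R| / (|P| + |R|) over binary masks."""
    inter = np.logical_and(pred, ref).sum()
    denom = pred.sum() + ref.sum()
    return 2.0 * inter / denom if denom else 1.0


def lesion_dice(pred, ref):
    """Simplified lesion-wise Dice: 2*TP / (2*TP + FP + FN), where a
    reference lesion (connected component) counts as a true positive
    if any predicted voxel overlaps it, a predicted component with no
    reference overlap is a false positive, and a missed reference
    lesion is a false negative."""
    ref_lab, n_ref = ndimage.label(ref)
    pred_lab, n_pred = ndimage.label(pred)
    tp = sum(1 for i in range(1, n_ref + 1) if np.any(pred[ref_lab == i]))
    fn = n_ref - tp
    fp = sum(1 for j in range(1, n_pred + 1) if not np.any(ref[pred_lab == j]))
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 1.0
```

Because lesion-wise Dice gives every lesion equal weight regardless of size, a method that misses many small lesions (as reported for LST-LPA, lesion-wise Dice 0.02) can still post a moderate voxel-wise Dice driven by large periventricular lesions.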

Paper Structure

This paper contains 24 sections, 4 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Comparison of segmentation methods. Shown are 7T FLAIR 0.5×0.5×0.5 mm³ images from three participants (A, B, and C) in this study and corresponding segmentations performed and/or trained on the 0.5×0.5×0.5 mm³ images. Visual inspection highlights qualitative differences between segmentation outputs from each method. Manual (red) masks indicate the image manually annotated by an expert rater, which acted as the ‘gold standard’ against which quantitative comparisons were performed. LST LPA appears to erroneously label bright cortical boundaries, which are often prominent on 7T FLAIR, as lesion, along with the choroid plexus (A) and septum pellucidum (B). LST AI avoids many of the errors made by LST LPA, but misses some lesions (left frontal in A) and underfills others (left frontal in C). UNETR and Segformer performed relatively well compared to manual annotation, in addition to capturing small lesions missed by the manual rater (small right subcortical WML in B and C). However, both occasionally produced false positives at bright cortical boundaries (left frontoparietal in C).
  • Figure 2: Handling of image artifacts. Shown are 7T FLAIR images (0.5×0.5×0.5 mm³) from one participant in this study in which a wraparound artifact, likely from ear tissue, occurs in the region of the pons and cerebellum. Manual annotation did not identify this as lesion, but LST LPA, LST AI, and UNETR masked this as a false positive lesion. Segformer did not.
  • Figure 3: Comparison of models trained at various image resolutions. Shown are 7T FLAIR images from a participant with MS at the original 0.5×0.5×0.5 mm³ resolution and downsampled to 1.0×1.0×1.0 mm³ and 1.5×1.5×2.0 mm³ resolution (first row), along with a manually drawn lesion mask (red). UNETR segmentation is shown in yellow and Segformer segmentation in purple. The model trained on images and masks at the 0.5×0.5×0.5 mm³ resolution appears closest in segmentation accuracy to the manual, ‘ground-truth’ masks, with a slight reduction in accuracy seen at the downsampled 1.0×1.0×1.0 mm³ resolution. Downsampling to 1.5×1.5×2.0 mm³ yields clearly inferior segmentation compared with either isotropic option.
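The multi-resolution comparison in Figure 3 requires resampling the native 0.5 mm isotropic volumes to coarser grids. The paper does not publish its preprocessing pipeline in this excerpt; a minimal sketch, assuming `scipy.ndimage.zoom` for interpolation, could look like the following (the function name `resample` and the random test volume are illustrative only):

```python
import numpy as np
from scipy.ndimage import zoom


def resample(volume, spacing, new_spacing, order=1):
    """Resample a 3D volume from voxel spacing `spacing` (mm, z/y/x)
    to `new_spacing` via spline interpolation. Use order=1 (trilinear)
    for images and order=0 (nearest neighbor) for binary lesion masks
    so mask values stay 0/1."""
    factors = [s / ns for s, ns in zip(spacing, new_spacing)]
    return zoom(volume, factors, order=order)


# Illustrative: downsample a native 0.5 mm isotropic volume to 1.0 mm
# isotropic; a 64³ array becomes 32³.
vol = np.random.rand(64, 64, 64).astype(np.float32)
low = resample(vol, (0.5, 0.5, 0.5), (1.0, 1.0, 1.0))
```

Using nearest-neighbor interpolation for the masks matters: linear interpolation would produce fractional values at lesion borders and systematically erode the small lesions whose loss at coarse resolution Figure 3 illustrates.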