Learn2Reg 2024: New Benchmark Datasets Driving Progress on New Challenges

Lasse Hansen, Wiebke Heyer, Christoph Großbröhmer, Frederic Madesta, Thilo Sentker, Wang Jiazheng, Yuxi Zhang, Hang Zhang, Min Liu, Junyi Wang, Xi Zhu, Yuhua Li, Liwen Wang, Daniil Morozov, Nazim Haouchine, Joel Honkamaa, Pekka Marttinen, Yichao Zhou, Zuopeng Tan, Zhuoyuan Wang, Yi Wang, Hongchao Zhou, Shunbo Hu, Yi Zhang, Qian Tao, Lukas Förner, Thomas Wendler, Bailiang Jian, Christian Wachinger, Jin Kim, Dan Ruan, Marek Wodzinski, Henning Müller, Tony C. W. Mok, Xi Jia, Jinming Duan, Mikael Brudfors, Seyed-Ahmad Ahmadi, Yunzheng Zhu, William Hsu, Tina Kapur, William M. Wells, Alexandra Golby, Aaron Carass, Harrison Bai, Yihao Liu, Perrine Paul-Gilloteaux, Joakim Lindblad, Nataša Sladoje, Andreas Walter, Junyu Chen, Reuben Dorent, Alessa Hering, Mattias P. Heinrich

TL;DR

Learn2Reg 2024 expands medical image registration benchmarking with three new datasets (ReMIND2Reg, LUMIR, and COMULISglobe SHG/BF) alongside continued NLST work. Together they address multimodal intraoperative registration, large-scale unsupervised brain MRI alignment, and microscopy-based registration. Submissions range from classical optimization (e.g., NiftyReg, Greedy) to diverse deep learning strategies and foundation-model adaptations, and the results show that performance depends strongly on data scale and task modality. Key findings include the competitiveness of classical methods on challenging multimodal tasks with limited data, the advantage of large-scale datasets for deep learning in brain MRI registration, and robust but fallible performance in histology and intraoperative scenarios. The outcomes highlight the need for generalizable, robust, and task-agnostic registration frameworks and call for broader cross-disciplinary participation to extend Learn2Reg’s benchmarking impact across biomedical imaging domains.

Abstract

Medical image registration is critical for clinical applications, and fair benchmarking of different methods is essential for monitoring ongoing progress in the field. To date, the Learn2Reg 2020-2023 challenges have released several complementary datasets and established evaluation metrics. Building on this foundation, the 2024 edition expands the challenge's scope to cover a wider range of registration scenarios, particularly in terms of modality diversity and task complexity, by introducing three new tasks, including large-scale multi-modal registration and unsupervised inter-subject brain registration, as well as the first microscopy-focused benchmark within Learn2Reg. The new datasets also inspired new method developments, including invertibility constraints, pyramid features, keypoint alignment and instance optimisation. Visit Learn2Reg at https://learn2reg.grand-challenge.org.

Paper Structure

This paper contains 19 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Learn2Reg 2024 continues the Learn2Reg Challenge series, which began in 2020, by introducing three new diverse datasets and expanding the challenge’s scope to cover a wider range of biomedical imaging applications.
  • Figure 2: Learn2Reg 2024 tasks at a glance. This overview illustrates the diversity of the three new tasks in Learn2Reg 2024. ReMIND2Reg challenges participants with intraoperative ultrasound-to-MRI brain registration, addressing brain shift and flexible input modalities. LUMIR focuses on large-scale, inter-subject brain alignment without label supervision, using over 4,000 scans from 10 input sources. COMULISglobe SHG/BF introduces the first histology-based task in the Learn2Reg series, aligning complementary breast tissue modalities—second harmonic generation (SHG) and brightfield (BF) microscopy. All tasks include hidden test data and are evaluated using manual landmarks, label maps, or both.
  • Figure 3: Results for the ReMIND2Reg task. Target Registration Error (TRE) on manual landmarks and TRE30 (the TRE computed on the 30% of landmarks with the largest initial errors) are reported for all methods. * Baseline method provided by organizers.
  • Figure 4: Results for the LUMIR task. The table presents final test set results for all participating teams and baseline methods. Reported metrics include Target Registration Error (TRE) on manual landmarks, Dice-Sørensen coefficient (DSC) and 95th percentile Hausdorff distance (HD95) on brain label maps, as well as the non-diffeomorphic volume (NDV) of displacement fields. The accompanying box plot provides a visual overview of DSC performance distribution across all methods. * Baseline method provided by organizers.
  • Figure 5: Results for the COMULISglobe SHG/BF task. *The mean inter-rater error was computed on a random subset of test cases. It is therefore not directly comparable to the evaluated methods, but it provides a useful reference for the level of alignment that may be achievable. * Baseline method provided by organizers.
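
The landmark metrics named in the Figure 3 caption can be illustrated with a short sketch. It is not the challenge's evaluation code: the function names, the voxel-spacing argument, and the rounding used to pick the 30% subset are assumptions made for illustration. Only the definition of TRE30 (TRE restricted to the 30% of landmarks with the largest initial errors) comes from the caption.

```python
import math


def landmark_errors(fixed_pts, moving_pts, spacing=(1.0, 1.0, 1.0)):
    """Euclidean distance between paired landmarks, scaled by voxel spacing.

    `spacing` is a hypothetical argument; pass (1, 1, 1) if the points are
    already given in physical units.
    """
    return [
        math.sqrt(sum(((f - m) * s) ** 2 for f, m, s in zip(fp, mp, spacing)))
        for fp, mp in zip(fixed_pts, moving_pts)
    ]


def mean_tre(errors):
    """Target Registration Error: mean landmark distance after registration."""
    return sum(errors) / len(errors)


def tre30(initial_errors, final_errors, fraction=0.3):
    """Mean final error over the landmarks with the largest initial errors.

    The subset size uses round(); the exact rounding rule used by the
    challenge is not stated in the caption, so this is an assumption.
    """
    n = len(initial_errors)
    k = max(1, round(fraction * n))
    # Indices of the k landmarks that were worst aligned before registration.
    hardest = sorted(range(n), key=lambda i: initial_errors[i], reverse=True)[:k]
    return sum(final_errors[i] for i in hardest) / k


# Toy example with ten landmarks (values are made up, not challenge data).
initial = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
final = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 2.0, 3.0, 4.0]

print("TRE:", mean_tre(final))
print("TRE30:", tre30(initial, final))
```

In this toy case the three landmarks with the largest initial errors are also the ones that remain worst aligned, so TRE30 is well above the overall TRE. This is the kind of failure on hard landmarks that a plain average can hide.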