DeepATLAS: One-Shot Localization for Biomedical Data

Peter D. Chang

TL;DR

DeepATLAS tackles dense anatomical localization in high-dimensional biomedical data by learning an anatomically-consistent, universal coordinate embedding. It uses two self-supervised registration objectives, implicit $\Phi_I$ and explicit $\Phi_E$, coupled with a feature-based similarity loss and smoothness regularization to produce stable coordinate maps that generalize across exams. After pretraining on $>51{,}000$ unlabeled CT volumes, one-shot and few-shot segmentation of 51 structures across four external cohorts achieve Dice scores around $0.70$–$0.84$ with $HD_{95}$ in the millimeter-to-centimeter range, often rivaling or exceeding supervised baselines; gains are enhanced by adding external data and semi-supervised fine-tuning. The learned representations support scalable preprocessing, cropping, and downstream tasks, with potential for out-of-distribution detection and active learning, making the approach broadly applicable beyond predefined atlas-based segmentation.
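The two metrics quoted above can be illustrated with a minimal sketch. It is not the paper's evaluation code. Several choices here are assumptions made for illustration: masks are represented as sets of voxel indices, $HD_{95}$ is taken as the larger of the two directed nearest-rank 95th-percentile distances, distances are measured over all foreground voxels rather than extracted surfaces, and the names `dice`, `hd95`, and `spacing` are hypothetical.

```python
import math


def dice(pred, ref):
    """Dice overlap between two masks given as sets of voxel indices."""
    pred, ref = set(pred), set(ref)
    if not pred and not ref:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(pred & ref) / (len(pred) + len(ref))


def _directed_distances(src, dst, spacing):
    """Distance (in physical units) from each src voxel to its nearest dst voxel."""
    scale = lambda p: [c * s for c, s in zip(p, spacing)]
    dst_scaled = [scale(q) for q in dst]
    return [min(math.dist(scale(p), q) for q in dst_scaled) for p in src]


def _percentile95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[rank]


def hd95(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th-percentile Hausdorff distance (assumed definition, see lead-in)."""
    pred, ref = set(pred), set(ref)
    if not pred or not ref:
        raise ValueError("hd95 is undefined when either mask is empty")
    forward = _percentile95(_directed_distances(pred, ref, spacing))
    backward = _percentile95(_directed_distances(ref, pred, spacing))
    return max(forward, backward)


# Toy example: two 2-voxel masks that share one voxel.
pred_mask = {(0, 0, 0), (0, 0, 1)}
ref_mask = {(0, 0, 1), (0, 0, 2)}
dice_value = dice(pred_mask, ref_mask)
hd95_unit = hd95(pred_mask, ref_mask)
hd95_thick = hd95(pred_mask, ref_mask, spacing=(1.0, 1.0, 2.0))
```

The brute-force nearest-neighbor search is quadratic in the number of voxels, so this form only suits small masks; it is meant to make the definitions concrete rather than to be used at scale.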

Abstract

This paper introduces the DeepATLAS foundational model for localization tasks in the domain of high-dimensional biomedical data. Upon convergence of the proposed self-supervised objective, a pretrained model maps an input to an anatomically-consistent embedding from which any point or set of points (e.g., boxes or segmentations) may be identified in a one-shot or few-shot approach. As a representative benchmark, a DeepATLAS model pretrained on a comprehensive cohort of 51,000+ unlabeled 3D computed tomography exams yields high one-shot segmentation performance on over 50 anatomic structures across four different external test sets, either matching or exceeding the performance of a standard supervised learning model. Further improvements in accuracy can be achieved by adding a small amount of labeled data using either a semisupervised or more conventional fine-tuning strategy.
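As a rough illustration of the one-shot idea described above, the sketch below propagates a label from a single annotated reference exam to a new exam by matching predicted coordinates. The matching rule (nearest neighbor in coordinate space with a distance cutoff) is an assumption made for illustration and is not confirmed as the paper's procedure; the names `propagate_labels` and `max_dist` are hypothetical, and the coordinate maps are hard-coded toy values standing in for model output.

```python
import math


def propagate_labels(ref_coords, ref_labels, new_coords, max_dist=1.0):
    """Transfer labels from one annotated reference exam to a new exam.

    ref_coords / new_coords: dict mapping voxel index -> predicted coordinate.
    ref_labels: dict mapping voxel index -> integer label (missing means background, 0).
    A new voxel takes the label of the reference voxel whose coordinate is closest,
    provided that coordinate lies within max_dist; otherwise it stays background.
    """
    reference = [(coord, ref_labels.get(voxel, 0)) for voxel, coord in ref_coords.items()]
    propagated = {}
    for voxel, coord in new_coords.items():
        nearest_coord, label = min(reference, key=lambda item: math.dist(item[0], coord))
        within_range = math.dist(nearest_coord, coord) <= max_dist
        propagated[voxel] = label if within_range else 0
    return propagated


# Toy reference exam: three voxels, the middle one annotated with label 7.
ref_coords = {
    (0, 0, 0): (0.0, 0.0, 0.0),
    (0, 0, 1): (0.0, 0.0, 1.0),
    (0, 0, 2): (0.0, 0.0, 2.0),
}
ref_labels = {(0, 0, 1): 7}

# Toy new exam: the same anatomy appears at different voxel positions.
new_coords = {
    (4, 4, 4): (0.0, 0.1, 1.1),  # close to the labeled reference coordinate
    (4, 4, 5): (0.0, 0.0, 2.1),  # close to an unlabeled reference coordinate
    (9, 9, 9): (5.0, 5.0, 5.0),  # far from every reference coordinate
}
new_labels = propagate_labels(ref_coords, ref_labels, new_coords)
```

Because the lookup depends only on predicted coordinates, no gradient step is involved, which is consistent with the claim that a reference task performed once can be reused on new exams without further training.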

Paper Structure (50 sections, 16 equations, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Overview of DeepATLAS Framework. A diverse, unlabeled cohort of 51k CT exams across multiple anatomic regions is used for unsupervised pretraining. The self-supervised pretext task is to label every point in an exam with a unique coordinate, requiring only that consistent coordinates are chosen for the same structure across all exams. Importantly, the contents of the coordinate space are not defined explicitly but rather learned by the model during optimization; after convergence, this derived coordinate space may be visually approximated by a model-reconstructed whole-body atlas. The anatomically-consistent embeddings generated by DeepATLAS may be used to propagate any reference localization task performed just once (such as cropping via bounding box or segmentation) on new exams without further training.
  • Figure 2: DeepATLAS Network Architecture.
  • Figure 3: Summary of Dice Score and Hausdorff Distance Performance. Overall performance is plotted across all 51 anatomic structures and seven experiments for Dice score (top panel) and 95th percentile Hausdorff Distance (bottom panel). The baseline experiment (ATLAS-all) is compared to data-constrained (ATLAS-500 = 1% of data, ATLAS-5k = 10% of data), data-extended (ATLAS-ext = combined with external data), semisupervised (joint loss), as well as supervised (with and without pretraining) experiments. A total of four external test cohorts are included: (1) Head-and-Neck Organ-at-Risk (OAR); (2) CT Multi-Organ (CT-ORG); (3) Chest Organ-at-Risk (StructSeg); (4) Anatomy3 (VISCERAL).
  • Figure 4: One-Shot Segmentation (Head-and-Neck Organ-at-Risk Cohort). A forward-pass of the DeepATLAS model labels every discrete position in an exam with a coordinate matching its underlying anatomy. To visually approximate the predicted anatomy at each location, the generated coordinate map may be used to project the learned atlas (a reconstruction of the learned coordinate space) to any given exam. As shown in the left two panels, the aligned atlas reconstruction exhibits high-fidelity correlation with raw CT data, suggesting the ability to generalize across many anatomic structures including those not directly evaluated in this experiment. In the right panels, single-shot segmentation masks are shown overlaid on the aligned atlas (top panels), raw CT data (middle panels), and ground-truth (bottom panels). In this example, segmentation masks are shown for the lens, orbit, optic nerve, lacrimal gland, cochlea, mandible, parotid gland, submandibular gland, brain, and brainstem.
  • Figure 5: One-Shot Segmentation (Anatomy3 Cohort). In this example, segmentation masks are shown for the lungs, trachea, sternum, thoracic aorta, abdominal aorta, liver, spleen, pancreas, kidneys, iliopsoas muscle, and bladder. See Figure 4 for further details.
  • ...and 3 more figures