Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior

Young Seok Jeon; Hongfei Yang; Huazhu Fu; Mengling Feng

TL;DR

This work tackles anatomically inconsistent predictions in medical image segmentation by introducing AIC-Net, a cascaded global-local framework that injects a learnable Anatomical Prior that is deformed to match each patient's anatomy. The prior undergoes differentiable affine and thin-plate-spline (TPS) deformations and then guides the decoders toward anatomy-aware predictions; the same process is repeated on local patches for refinement. A centroid loss stabilizes prior deformation so that a common vertebrae configuration is learned, and a regularized loss balances Dice overlap with geometric alignment. The approach yields consistent improvements in Dice score, and especially in Hausdorff distance, on the organ and vertebrae tasks of TotalSegmentator, demonstrating the practical impact of learnable anatomical priors for robust multi-organ segmentation.
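
To make the idea of balancing Dice overlap with geometric alignment concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function names, the squared-distance form of the centroid term, and the weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - mean per-class Dice for probability maps of shape (C, *spatial)."""
    axes = tuple(range(1, pred.ndim))
    inter = (pred * target).sum(axis=axes)
    denom = pred.sum(axis=axes) + target.sum(axis=axes)
    dice = (2.0 * inter + eps) / (denom + eps)
    return float(1.0 - dice.mean())

def soft_centroids(mask, eps=1e-6):
    """Mass-weighted centroid of each class in voxel coordinates, shape (C, ndim)."""
    spatial = mask.shape[1:]
    grids = np.meshgrid(*[np.arange(s, dtype=float) for s in spatial], indexing="ij")
    coords = np.stack(grids, axis=0).reshape(len(spatial), -1)  # (ndim, N)
    mass = mask.reshape(mask.shape[0], -1)                      # (C, N)
    return (mass @ coords.T) / (mass.sum(axis=1, keepdims=True) + eps)

def centroid_loss(prior, target):
    """Mean squared distance between class centroids (assumed form, for illustration)."""
    diff = soft_centroids(prior) - soft_centroids(target)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def total_loss(pred, prior, target, lam=0.1):
    """Dice term plus a weighted centroid term; `lam` is a hypothetical weight."""
    return soft_dice_loss(pred, target) + lam * centroid_loss(prior, target)
```

The centroid term stays informative even when the prior and the target do not overlap at all, a case in which the Dice term alone is flat and gives no direction to move in.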

Abstract

Imposing key anatomical features, such as the number of organs, their shapes and relative positions, is crucial for building a robust multi-organ segmentation model. Current attempts to incorporate anatomical features include broadening the effective receptive field (ERF) size with data-intensive modules, or introducing anatomical constraints that scale poorly to multi-organ segmentation. We introduce a novel architecture called the Anatomy-Informed Cascaded Segmentation Network (AIC-Net). AIC-Net incorporates a learnable input termed "Anatomical Prior", which can be adapted to patient-specific anatomy using a differentiable spatial deformation. The deformed prior then guides the decoder layers towards more anatomy-informed predictions. We repeat this process at a local patch level to enhance the representation of intricate objects, resulting in a cascaded network structure. AIC-Net is a general method that can make any existing segmentation model more anatomy-aware. We have validated the performance of AIC-Net, with various backbones, on two multi-organ segmentation tasks: abdominal organs and vertebrae. On both tasks, our benchmarks demonstrate improved Dice score and Hausdorff distance.

Paper Structure (28 sections, 10 equations, 8 figures, 10 tables)

Introduction
Prior Works
Broadening ERF
Mesh Deformation
Topology regularization
Method
Network Overview
Deform block
Affine block
TPS block
Learnable Anatomical Prior
Aggregation of Anatomical Prior
Loss Function
Dice loss
Centroid loss
...and 13 more sections

Figures (8)

Figure 1: Should the gray spot indicated by the blue arrow be labeled as adrenal gland? (a) Scan slice, (b) ground-truth 3D segmentation around the slice (with the adrenal gland label removed), (c) all baseline models wrongly segment the spot as gland, and (d) AIC-Net gives the correct segmentation.
Figure 2: (a) Overview of AIC-Net. AIC-Net is a cascaded network combining global and local views for comprehensive multi-organ segmentation. The initial input $\mathbf{X}_{g}$ yields a rough global prediction $\widehat{\mathbf{Y}}_{g}$, which is enhanced by a learnable Anatomical Prior $\widehat{\mathbf{Pr}}_{g}$, a spatially deformed anatomy obtained from the learnable parameters $\mathbf{Pr}_{g}$ via $\text{Deform}_g$. This process repeats in the local segment of the model for further enhancement, taking the local view $\mathbf{X}_{l}$ and the local prior $\mathbf{Pr}_{l}$. (b) The Deform block receives embeddings from the vision and prior encoders, concatenates them, and performs affine and TPS deformations on the Anatomical Prior. Affine translates each organ. TPS warps the translated organ for more precise matching.
Figure 3: The SE-res block is a Squeeze-and-Excitation block with a skip connection that merges a decoder embedding $\mathbf{z}_{\text{decoder}}^{(l)}$ with a down-sized deformed prior $\widehat{\mathbf{Pr}}^{(l)}$ to produce a refined decoder embedding $\widehat{\mathbf{z}}_{\text{decoder}}^{(l)}$. Layer Norm normalizes high values in $\widehat{\mathbf{Pr}}^{(l)}$.
Figure 4: Impact of centroid loss. A common vertebrae configuration should be learned, while the Deform block aligns the prior to the right positions. (a) Three common scan types in the dataset. Scans always appear at the center of the padded volume. (b) Without centroid loss (failure case): the Deform block fails to apply large displacements, and the learned prior is forced to adopt three vertebrae configurations. (c) With centroid loss, a prior with correct anatomy can be learned.
Figure 5: Visualizations of learned common priors (left) and their deformation to patient-specific anatomies (right).
...and 3 more figures
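
The Figure 2 caption states that the affine step translates each organ of the prior. The NumPy sketch below illustrates that idea in 2D with bilinear resampling, which is the kind of interpolation that keeps a shift differentiable in an autograd framework. It is an illustration under assumptions, not the paper's code: the function names are hypothetical, the shifts are passed in directly rather than predicted by a network, and the TPS step is omitted.

```python
import numpy as np

def translate_channel(img, shift):
    """Shift a 2D map by a real-valued (dy, dx) using bilinear sampling and zero padding."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h, dtype=float),
                         np.arange(w, dtype=float), indexing="ij")
    src_y, src_x = ys - shift[0], xs - shift[1]   # where each output pixel samples from
    y0 = np.floor(src_y).astype(int)
    x0 = np.floor(src_x).astype(int)
    wy, wx = src_y - y0, src_x - x0               # fractional parts, in [0, 1)
    out = np.zeros((h, w), dtype=float)
    corners = (
        (0, 0, (1 - wy) * (1 - wx)),
        (0, 1, (1 - wy) * wx),
        (1, 0, wy * (1 - wx)),
        (1, 1, wy * wx),
    )
    for dy, dx, weight in corners:
        yy, xx = y0 + dy, x0 + dx
        valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
        out[valid] += weight[valid] * img[yy[valid], xx[valid]]
    return out

def translate_prior(prior, shifts):
    """Apply one translation per organ channel; prior is (K, H, W), shifts is (K, 2)."""
    return np.stack([translate_channel(p, s) for p, s in zip(prior, shifts)], axis=0)
```

Because every organ channel gets its own shift, organs can be repositioned independently before any finer warping is applied.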
