ACE: Anatomically Consistent Embeddings in Composition and Decomposition
Ziyu Zhou, Haozhe Luo, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Xiaowei Ding, Michael Gotway, Jianming Liang
TL;DR
ACE is an anatomically aware self-supervised learning method for medical images that enforces global consistency and local composition/decomposition through grid-based patch matching. Its two-branch framework learns global macro-structures and fine-grained local tissue details, yielding embeddings that support accurate patch-level retrieval, cross-patient anatomical correspondence, and symmetry, and that transfer well to classification and segmentation tasks on chest X-ray and fundus images. The approach shows data-efficient few-shot performance, competitive fine-tuning results, and generalization to other modalities, which points to the practical value of building anatomical priors into SSL for medical imaging.
Abstract
Medical images acquired from standardized protocols show consistent macroscopic or microscopic anatomical structures, and these structures consist of composable/decomposable organs and tissues. Existing self-supervised learning (SSL) methods, however, do not exploit these composable/decomposable structural attributes inherent to medical images. To overcome this limitation, this paper introduces a novel SSL approach called ACE, which learns anatomically consistent embeddings via composition and decomposition with two key branches: (1) global consistency, which captures discriminative macro-structures by extracting global features; and (2) local consistency, which learns fine-grained anatomical details from composable/decomposable patch features via correspondence matrix matching. Experimental results across 6 datasets and 2 backbones, evaluated in few-shot learning, fine-tuning, and property analysis, show ACE's superior robustness, transferability, and clinical potential. The innovations of ACE lie in grid-wise image cropping, in leveraging the intrinsic compositionality and decompositionality of medical images, in bridging the semantic gap from high-level pathologies to low-level tissue anomalies, and in providing a new SSL method for medical imaging.
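To make the local branch more concrete, the following is a minimal NumPy sketch of how grid-wise cropping could yield a patch correspondence matrix between two overlapping crops, and how neighbouring patch features could be composed into a coarser one. It is an illustration under stated assumptions, not the authors' implementation: the function names (`correspondence_matrix`, `compose`, `matching_loss`), the square crop shape, the use of averaging for composition, and the cosine-based loss are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def correspondence_matrix(top_left_a, top_left_b, size):
    """Return M with M[i, j] = 1 when patch i of crop A and patch j of crop B
    cover the same grid cell. Both crops span size x size whole grid cells and
    are given by the (row, col) grid cell of their top-left corner.
    (Hypothetical helper; square, grid-aligned crops are an assumption.)"""
    (ra, ca), (rb, cb) = top_left_a, top_left_b
    m = np.zeros((size * size, size * size), dtype=np.float32)
    for i in range(size * size):
        row, col = ra + i // size, ca + i % size  # absolute grid cell of patch i
        rj, cj = row - rb, col - cb               # same cell in B's local coordinates
        if 0 <= rj < size and 0 <= cj < size:
            m[i, rj * size + cj] = 1.0
    return m


def compose(patch_feats, size, k=2):
    """Compose each k x k block of neighbouring patch features into one coarser
    feature by averaging (averaging is an assumed composition operator).
    patch_feats has shape (size * size, dim) in row-major patch order."""
    dim = patch_feats.shape[-1]
    blocks = patch_feats.reshape(size // k, k, size // k, k, dim)
    return blocks.mean(axis=(1, 3)).reshape(-1, dim)


def matching_loss(feats_a, feats_b, m):
    """Mean (1 - cosine similarity) over matched patch pairs.
    (An assumed loss form, used here only for illustration.)"""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T
    return float(((1.0 - sim) * m).sum() / max(m.sum(), 1.0))


if __name__ == "__main__":
    size = 4
    # Crop B is shifted by one grid cell down and right relative to crop A.
    m = correspondence_matrix((0, 0), (1, 1), size)
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(size * size, 8))
    coarse = compose(feats, size)
    print(m.shape, int(m.sum()), coarse.shape)
```

Because both crops are aligned to the same grid, overlapping patches coincide exactly, so the correspondence matrix is binary rather than a soft overlap estimate. The same matrix can be recomputed for composed (coarser) patches by treating each k x k block as a single cell.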
