Salt & Pepper Heatmaps: Diffusion-informed Landmark Detection Strategy
Julian Wyatt, Irina Voiculescu
TL;DR
This work tackles automatic Anatomical Landmark Detection by casting it as a diffusion-based generative task that yields multi-channel heatmaps capturing uncertainty. The authors introduce a single-step diffusion formulation with a multi-step backbone, a time-encoded U-Net, and a loss that combines a supervised term with a spatially normalized cross-entropy, sharpened by a gradually increasing Gaussian blur during reverse diffusion. On a cephalometric dataset, the multi-step diffusion approach achieves a state-of-the-art mean radial error (MRE) and a successful detection rate (SDR) competitive with prior methods; single-step estimates lag behind, which highlights the benefit of iterative refinement. The method offers accurate, uncertainty-aware landmark localization with potential efficiency gains, and motivates future extensions to region-based predictions and informed priors to further improve speed and accuracy in clinical workflows.
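To make the two reported metrics concrete, the following is a minimal sketch of how MRE and SDR are commonly computed in landmark detection, assuming the usual definitions: MRE as the mean Euclidean distance between predicted and ground-truth landmarks, and SDR as the percentage of landmarks whose error falls within a given radius. The function names, the default pixel spacing, and the thresholds of 2, 2.5, 3 and 4 mm are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def radial_errors(pred, truth, pixel_spacing_mm=1.0):
    """Euclidean distance (in mm) between predicted and true landmarks.

    pred, truth: array-likes of shape (n_images, n_landmarks, 2), pixel coordinates.
    pixel_spacing_mm: assumed physical size of one pixel (hypothetical default).
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.linalg.norm(pred - truth, axis=-1) * pixel_spacing_mm


def mean_radial_error(pred, truth, pixel_spacing_mm=1.0):
    """MRE: average radial error over all images and landmarks."""
    return float(radial_errors(pred, truth, pixel_spacing_mm).mean())


def successful_detection_rate(pred, truth,
                              thresholds_mm=(2.0, 2.5, 3.0, 4.0),
                              pixel_spacing_mm=1.0):
    """SDR: percentage of landmarks with radial error within each threshold."""
    errors = radial_errors(pred, truth, pixel_spacing_mm)
    return {t: float((errors <= t).mean() * 100.0) for t in thresholds_mm}


if __name__ == "__main__":
    # One image, two landmarks: the first is off by (3, 4) pixels, the second is exact.
    truth = [[[0, 0], [10, 10]]]
    pred = [[[3, 4], [10, 10]]]
    print(mean_radial_error(pred, truth))
    print(successful_detection_rate(pred, truth))
```

A lower MRE and a higher SDR are better; the two can disagree, since MRE is sensitive to a few large errors while SDR only counts whether each landmark falls inside the radius.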
Abstract
Anatomical Landmark Detection is the process of identifying key areas of an image for clinical measurements. Each landmark is a single ground truth point labelled by a clinician. A machine learning model predicts the locus of a landmark as a probability region represented by a heatmap. Diffusion models have grown in popularity for generative modelling because of their high-quality sampling and mode coverage, which has led to their adoption in medical image processing for semantic segmentation. Diffusion modelling can be further adapted to learn a distribution over landmarks. The stochastic nature of diffusion models captures fluctuations in the landmark prediction, which we leverage by blurring the predictions into meaningful probability regions. In this paper, we reformulate automatic Anatomical Landmark Detection as a precise generative modelling task, producing a few-hot pixel heatmap. Our method achieves state-of-the-art mean radial error (MRE) and successful detection rate (SDR) performance comparable to existing work.
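The idea of blurring stochastic few-hot predictions into a probability region can be illustrated with a small sketch. It assumes that several binary samples for one landmark channel are summed, smoothed with a fixed Gaussian kernel, and normalised to sum to one. This is an illustration of the general idea only, not the authors' procedure; the function names, the value of sigma, and the zero-padded NumPy blur are all assumptions.

```python
import numpy as np


def gaussian_kernel_1d(sigma, truncate=3.0):
    """Normalised 1D Gaussian kernel with radius of about truncate * sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def blur2d(image, sigma):
    """Separable Gaussian blur with zero padding.

    Assumes both image sides are at least as long as the kernel.
    """
    kernel = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, rows)


def samples_to_heatmap(samples, sigma=2.0):
    """Turn binary few-hot samples of shape (n_samples, H, W) into a probability map.

    Hypothetical helper: sums the samples, blurs, and normalises to sum to 1.
    """
    counts = np.asarray(samples, dtype=float).sum(axis=0)
    blurred = blur2d(counts, sigma)
    total = blurred.sum()
    if total == 0:
        raise ValueError("no hot pixels in any sample")
    return blurred / total


def heatmap_peak(heatmap):
    """(row, col) of the most probable pixel."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(row), int(col)


if __name__ == "__main__":
    # Four samples that scatter around pixel (10, 10) of a 32 x 32 image.
    samples = np.zeros((4, 32, 32))
    for i, (r, c) in enumerate([(10, 10), (10, 11), (11, 10), (10, 10)]):
        samples[i, r, c] = 1
    heatmap = samples_to_heatmap(samples, sigma=2.0)
    print(heatmap_peak(heatmap))
```

In this sketch, the spread of the hot pixels across samples determines the width of the resulting region, so a landmark on which the samples disagree yields a broader, flatter heatmap than one on which they agree.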
