Landmarks Are Alike Yet Distinct: Harnessing Similarity and Individuality for One-Shot Medical Landmark Detection

Xu He, Zhen Huang, Qingsong Yao, Xiaoqian Zhou, S. Kevin Zhou

TL;DR

The paper tackles the seesaw phenomenon in multi-landmark detection for medical images, where improving accuracy on some landmarks degrades it on others. It first trains a separate single-landmark model (SLA) for each landmark, using pseudo-labels and continually updated template data to exploit landmark individuality. It then introduces an adapter-based fusion approach that shares parameters across landmarks while preserving landmark-specific learning, which reduces memory and computation relative to keeping one model per landmark. Empirical results on the ISBI 2015 Head dataset show that single-landmark models outperform joint training, and that CC2D-SLA-ATD-Adapter, although slightly below the combined single-landmark models, still surpasses prior state-of-the-art methods. Together, the methods offer a scalable, efficient framework for one-shot medical landmark detection with potential applicability to broader medical imaging tasks.

Abstract

Landmark detection plays a crucial role in medical imaging applications such as disease diagnosis, bone age estimation, and therapy planning. However, training models for detecting multiple landmarks simultaneously often encounters the "seesaw phenomenon", where improvements in detecting certain landmarks lead to declines in detecting others. Yet, training a separate model for each landmark increases memory usage and computational overhead. To address these challenges, we propose a novel approach based on the belief that "landmarks are distinct" by training models with pseudo-labels and template data updated continuously during the training process, where each model is dedicated to detecting a single landmark to achieve high accuracy. Furthermore, grounded on the belief that "landmarks are also alike", we introduce an adapter-based fusion model, combining shared weights with landmark-specific weights, to efficiently share model parameters while allowing flexible adaptation to individual landmarks. This approach not only significantly reduces memory and computational resource requirements but also effectively mitigates the seesaw phenomenon in multi-landmark training. Experimental results on publicly available medical image datasets demonstrate that the single-landmark models significantly outperform traditional multi-point joint training models in detecting individual landmarks. Although our adapter-based fusion model shows slightly lower performance compared to the combined results of all single-landmark models, it still surpasses the current state-of-the-art methods while achieving a notable improvement in resource efficiency.
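
The idea of combining shared weights with landmark-specific weights can be illustrated with a minimal sketch. The abstract does not specify the adapter's form, so the bottleneck residual adapter below is an assumption borrowed from common adapter designs, not the authors' architecture. The class name `AdapterFusion`, the helper names, and all sizes are hypothetical and chosen only for illustration.

```python
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class AdapterFusion:
    """One shared layer plus a small per-landmark adapter (hypothetical design)."""

    def __init__(self, dim, bottleneck, num_landmarks, seed=0):
        rng = random.Random(seed)
        # Shared weights: a single dim x dim layer used by every landmark.
        self.shared = make_matrix(dim, dim, rng)
        # Landmark-specific weights: a down-projection and an up-projection each.
        self.adapters = [
            (make_matrix(bottleneck, dim, rng), make_matrix(dim, bottleneck, rng))
            for _ in range(num_landmarks)
        ]

    def forward(self, features, landmark):
        """Apply the shared layer, then add the landmark's adapter correction."""
        hidden = matvec(self.shared, features)
        down, up = self.adapters[landmark]
        correction = matvec(up, matvec(down, hidden))
        return [h + c for h, c in zip(hidden, correction)]

    def num_params(self):
        """Count shared weights once and adapter weights once per landmark."""
        dim = len(self.shared)
        adapter_params = sum(
            len(down) * len(down[0]) + len(up) * len(up[0])
            for down, up in self.adapters
        )
        return dim * dim + adapter_params


def separate_models_params(dim, num_landmarks):
    """Parameter count when every landmark keeps its own full dim x dim layer."""
    return num_landmarks * dim * dim


# Illustrative sizes only (assumptions, not values from the paper).
model = AdapterFusion(dim=16, bottleneck=2, num_landmarks=19)
features = [1.0] * 16
out_first = model.forward(features, landmark=0)
out_second = model.forward(features, landmark=1)
fused_params = model.num_params()
separate_params = separate_models_params(dim=16, num_landmarks=19)
```

Under these assumed sizes the shared layer is stored once and each landmark adds only a small adapter, so `fused_params` is well below `separate_params`, while `out_first` and `out_second` still differ because each landmark applies its own correction to the shared features.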

Paper Structure

This paper contains 10 sections, 5 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Training framework of CC2D-SLA-ATD, which consists of three stages at each epoch: Train-Template, Train-PL, and Infer-PL. CC2D-SLA training, on the other hand, is composed of only the Train-PL and Infer-PL stages.
  • Figure 2: Network architecture of CC2D-SLA-ATD-Adapter. The upper-left part of the figure shows an illustration of the adapter-based feature map transformation.
  • Figure 3: The performances of our methods with different channel sizes.