Enhanced Landmark Detection Model in Pelvic Fluoroscopy using 2D/3D Registration Loss

Chou Mo, Yehyun Suh, J. Ryan Martin, Daniel Moyer

TL;DR

The paper addresses pelvic landmark detection in fluoroscopy when intraoperative pose varies, coupling landmark localization with 2D/3D registration. It adds a pose estimation loss derived from 2D/3D registration to training and compares three models: a baseline U-Net, a U-Net trained with Pose Estimation Loss (PEL), and a U-Net sequentially fine-tuned with PEL. Sequential fine-tuning with PEL lowers mean RMSE on both the internal and external datasets (approx. $8.58\rightarrow 8.45$ mm internally and $5.58\rightarrow 5.09$ mm externally), while composite-loss or pure PEL training underperforms or diverges. The findings suggest that geometry-aware training curricula that leverage 3D constraints can improve 2D landmark localization for intraoperative pelvic imaging.
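To put the reported RMSE changes in perspective, the short sketch below computes the relative error reduction from the approximate figures quoted above. The helper names `landmark_rmse` and `relative_reduction` are hypothetical, and the RMSE definition used here (root-mean-square of per-landmark Euclidean distances) is an assumption; the paper may aggregate errors differently.

```python
import math


def landmark_rmse(pred, truth):
    """Root-mean-square of per-landmark Euclidean distances.

    Assumed definition for illustration; inputs are lists of (x, y)
    coordinates in the same units (e.g. mm).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    squared = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Fractional reduction in error going from `before` to `after`."""
    return (before - after) / before


# Approximate mean RMSE values (mm) quoted in the TL;DR.
internal = relative_reduction(8.58, 8.45)
external = relative_reduction(5.58, 5.09)
print(f"internal: {internal:.1%}, external: {external:.1%}")
# prints "internal: 1.5%, external: 8.8%"
```

Under these figures the gain is small on the internal dataset and more noticeable on the external one.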

Abstract

Automated landmark detection offers an efficient approach for medical professionals to understand patient anatomic structure and positioning using intra-operative imaging. While current detection methods for pelvic fluoroscopy demonstrate promising accuracy, most assume a fixed Antero-Posterior view of the pelvis. However, orientation often deviates from this standard view, either due to repositioning of the imaging unit or of the target structure itself. To address this limitation, we propose a novel framework that incorporates 2D/3D landmark registration into the training of a U-Net landmark prediction model. We analyze the performance difference by comparing landmark detection accuracy between the baseline U-Net, U-Net trained with Pose Estimation Loss, and U-Net fine-tuned with Pose Estimation Loss under realistic intra-operative conditions where patient pose is variable.

Paper Structure

This paper contains 12 sections, 10 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Model pipeline diagram for three tested models. From top to bottom: (a) Baseline U-Net, (b) U-Net trained on pose estimation loss, and (c) U-Net fine-tuned on pose estimation loss. $\theta$ denotes the predicted pose, and $\theta^*$ denotes the ground truth pose.
  • Figure 2: Landmark placement of eight anatomical landmarks on the pelvic bone, enlarged for visualization.
  • Figure 3: Model pipeline diagram for three tested models. From top to bottom: (a) Baseline U-Net, (b) U-Net trained on pose estimation loss, and (c) U-Net fine-tuned on pose estimation loss. $\theta$ denotes the predicted pose, and $\theta^*$ denotes the ground truth pose.