Structure-to-Image: Zero-Shot Depth Estimation in Colonoscopy via High-Fidelity Sim-to-Real Adaptation

Juan Yang, Yuyan Zhang, Han Jia, Bing Hu, Wanzhong Song

TL;DR

This work is the first to introduce phase congruency to colonoscopic domain adaptation and design a cross-level structure constraint to co-optimize geometric structures and fine-grained details like vascular textures.

Abstract

Monocular depth estimation (MDE) for colonoscopy is hampered by the domain gap between simulated and real-world images. Existing image-to-image translation methods, which use depth as a posterior constraint, often produce structural distortions and specular highlights because they fail to balance realism with structure consistency. To address this, we propose a Structure-to-Image paradigm that transforms the depth map from a passive constraint into an active generative foundation. We are the first to introduce phase congruency to colonoscopic domain adaptation, and we design a cross-level structure constraint to co-optimize geometric structures and fine-grained details like vascular textures. In zero-shot evaluations on a publicly available phantom dataset, an MDE model fine-tuned on our generated data reduced RMSE by up to 44.18% compared to competing methods. Our code is available at https://github.com/YyangJJuan/PC-S2I.git.
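
As a reading aid for the RMSE figure quoted in the abstract, the sketch below shows how a relative RMSE reduction is commonly computed. It assumes the usual definition, (baseline - method) / baseline; the abstract does not spell out the exact formula. The function names and the toy depth values are hypothetical and are not taken from the paper or its repository.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between two equal-length depth sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    squared_errors = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, method_rmse):
    """Percentage by which method_rmse is lower than baseline_rmse (assumed definition)."""
    return 100.0 * (baseline_rmse - method_rmse) / baseline_rmse


# Toy values for illustration only; these are NOT results from the paper.
target = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.4, 2.4, 3.4, 4.4]  # every prediction is off by 0.4
method_pred = [1.2, 2.2, 3.2, 4.2]    # every prediction is off by 0.2

baseline_rmse = rmse(baseline_pred, target)
method_rmse = rmse(method_pred, target)
reduction = relative_reduction(baseline_rmse, method_rmse)
print(f"RMSE reduction: {reduction:.2f}%")
```

Under this definition, a "maximum reduction" would be the largest such percentage across the competing methods used as baselines.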

Paper Structure (14 sections, 6 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Results of existing CycleGAN-based methods for Sim-to-Real colonoscopy show structural distortions and specular artifacts.
  • Figure 2: Structure-to-Image training pipeline with a cross-level structure constraint for geometric and micro-structural alignment.
  • Figure 3: Inference process of the proposed method.
  • Figure 4: Stair-step depth map causes contour-like artifacts in the generated image. (a) Depth map of the SimCol dataset, (b) depth profile along the red dashed line of (a), (c) generated realistic image from (a) with depth-to-image translation.
  • Figure 5: Structures extracted with various methods. (a) Real colonoscopy image; (b) Y channel of (a); (c) phase congruency map of (a); (d)-(h) edges extracted using the Roberts, Prewitt, Sobel, Canny, and Laplacian operators, respectively. The phase congruency map captures more structures, and does so more accurately.
  • ...and 2 more figures