Endoscopic Depth Estimation Based on Deep Learning: A Survey

Ke Niu, Zeyun Liu, Xue Feng, Heng Li, Qika Lin, Kaize Shi

TL;DR

This survey addresses endoscopic depth estimation with deep learning, framing a comprehensive review around data, methods, and clinical applications. It catalogs dataset types (synthetic, phantom, surgical), delineates monocular and stereo DL approaches, and classifies supervision strategies (supervised, semi-supervised, self-supervised, domain adaptation). Key contributions include a structured taxonomy of methods, standard evaluation metrics, and a discussion of challenges and future directions, notably multimodal data fusion and foundation-model–driven knowledge integration. The findings highlight significant progress in real-time 3D reconstruction and navigation but also underscore data scarcity, generalization gaps, and the need for robust clinical validation and explainability to enable widespread translation.

Abstract

Endoscopic depth estimation is a critical technology for improving the safety and precision of minimally invasive surgery, and it has attracted considerable attention from researchers in medical imaging, computer vision, and robotics. Many methods have been developed over the past decade. Several related surveys exist, but a comprehensive overview of recent deep learning-based techniques is still lacking. This paper bridges that gap by systematically reviewing the state-of-the-art literature from three perspectives: data, methods, and applications. At the data level, we describe how publicly available datasets are acquired. At the methodological level, we introduce both monocular and stereo deep learning-based approaches for endoscopic depth estimation. At the application level, we identify the specific challenges of deploying depth estimation in concrete clinical scenarios, together with the corresponding solutions. Finally, we outline potential directions for future research, such as domain adaptation, real-time implementation, and the fusion of depth information with sensor technologies, providing a starting point for researchers to engage with the field and advance it toward clinical translation.

Paper Structure

This paper contains 33 sections, 10 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Endoscopic depth estimation technology is presented from three perspectives: data, techniques, and applications.
  • Figure 2: The development roadmap of endoscopic depth estimation techniques.
  • Figure 3: The percentage increase of related diseases is depicted in the left panel, whereas the right panel illustrates the Disability-Adjusted Life Years (DALYs) rate for esophageal cancer in China per 100,000 population.
  • Figure 4: Common challenges and general approaches in endoscopic depth estimation.
  • Figure 5: Schematic illustration providing a concise overview of monocular endoscopic depth estimation.
  • ...and 1 more figure