EndoSight AI: Deep Learning-Driven Real-Time Gastrointestinal Polyp Detection and Segmentation for Enhanced Endoscopic Diagnostics
Daniel Cavadia
TL;DR
EndoSight AI addresses real-time GI polyp detection and segmentation by pairing a fast YOLOv8 detector with a dedicated U-Net segmentation model. Trained and evaluated on the Hyper-Kvasir dataset, the system reaches mAP@0.5 = $88.3\%$ for detection and Dice = $0.69$ for segmentation, while running at >$35$ FPS on GPU hardware. A key contribution is a thermal-aware training protocol that combines real-time GPU monitoring, adaptive cooling, and chunked epochs to make training on consumer hardware reliable. The work demonstrates practical deployment potential in endoscopy workflows through a live demo evaluation and quantitative metrics, and it provides open-source access to the models and demonstrations to support reproducibility and broader adoption in GI diagnostics.
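The thermal-aware protocol is only summarized above, so the following is a minimal sketch of how GPU monitoring, cooling pauses, and chunked epochs might fit together, not the authors' implementation. The function names, the 80/70 °C thresholds, the chunk size, and the polling interval are all illustrative assumptions, and "adaptive cooling" is approximated here as simply pausing until the GPU temperature falls below a resume threshold. The temperature is read with the `nvidia-smi` command-line tool.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def read_gpu_temp_c() -> float:
    """Read the temperature of the first GPU in degrees Celsius via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip().splitlines()[0])


def chunked(items: List, chunk_size: int) -> Iterable[List]:
    """Split one epoch's batches into consecutive chunks."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def train_epoch_thermal_aware(
    batches: List,
    train_step: Callable,                 # hypothetical: runs one optimizer step
    read_temp: Callable[[], float] = read_gpu_temp_c,
    chunk_size: int = 50,                 # assumed value
    max_temp_c: float = 80.0,             # assumed pause threshold
    resume_temp_c: float = 70.0,          # assumed resume threshold
    poll_seconds: float = 5.0,            # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Train one epoch in chunks, pausing between chunks when the GPU is hot.

    Returns the number of cooling pauses that were triggered.
    """
    pauses = 0
    for chunk in chunked(batches, chunk_size):
        for batch in chunk:
            train_step(batch)
        # Check the temperature once per chunk rather than once per batch.
        if read_temp() >= max_temp_c:
            pauses += 1
            # Wait until the GPU has cooled to the resume threshold or below.
            while read_temp() > resume_temp_c:
                sleep(poll_seconds)
    return pauses
```

Using two thresholds (pause at the higher one, resume at the lower one) avoids rapid toggling between training and waiting when the temperature hovers near a single limit. Passing `read_temp` and `sleep` as parameters also lets the loop be exercised without a GPU.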
Abstract
Precise, real-time detection of gastrointestinal polyps during endoscopic procedures is crucial for the early diagnosis and prevention of colorectal cancer. This work presents EndoSight AI, a deep learning architecture developed and evaluated independently to enable accurate polyp localization and detailed boundary delineation. Using the publicly available Hyper-Kvasir dataset, the system achieves a mean Average Precision (mAP) of 88.3% at an IoU threshold of 0.5 for polyp detection and a Dice coefficient of up to 0.69 (69%) for segmentation, with real-time inference speeds exceeding 35 frames per second on GPU hardware. Training incorporates clinically relevant performance metrics and a novel thermal-aware procedure intended to keep the model robust and the training process efficient. This integrated AI solution is designed for deployment in endoscopy workflows, with the aim of improving diagnostic accuracy and clinical decision-making in gastrointestinal healthcare.
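For readers less familiar with the two reported metrics, the sketch below shows the standard definitions in plain Python: the Dice coefficient between two binary masks, and the box IoU test that decides whether a detection counts as a true positive at the 0.5 threshold. It is illustrative only and is not the evaluation code used for the reported numbers. The function names are hypothetical, returning 1.0 for two empty masks is an assumed convention, and a full mAP computation additionally ranks detections by confidence and integrates the precision-recall curve, which is omitted here.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def dice_coefficient(pred: List[Sequence[int]], truth: List[Sequence[int]]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks given as rows of 0/1."""
    intersection = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            total += (1 if p else 0) + (1 if t else 0)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box: Box, truth_box: Box, threshold: float = 0.5) -> bool:
    """Matching rule behind mAP@0.5: a detection matches if IoU >= threshold."""
    return box_iou(pred_box, truth_box) >= threshold


if __name__ == "__main__":
    pred_mask = [[1, 1, 0], [0, 1, 0]]
    truth_mask = [[1, 0, 0], [0, 1, 1]]
    print(round(dice_coefficient(pred_mask, truth_mask), 3))   # 2 shared pixels, 3 + 3 total
    print(is_true_positive((0, 0, 10, 10), (2, 0, 12, 10)))    # IoU = 80 / 120
```

In the small example, the two masks share 2 foreground pixels out of 3 each, giving Dice = 4/6, and the two boxes overlap with IoU = 80/120, which clears the 0.5 threshold.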
