ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection
Yuncheng Jiang, Zixun Zhang, Yiwen Hu, Guanbin Li, Xiang Wan, Song Wu, Shuguang Cui, Silin Huang, Zhen Li
TL;DR
ECC-PolypDet addresses automatic polyp detection in colonoscopy with a two-stage training paradigm. The first stage combines Box-assisted Contrastive Learning (BCL) with a Semantic Flow-guided FPN (SFFPN) and Heatmap Propagation (HP) to detect concealed and small polyps robustly; the second stage applies IoU-guided Sample Re-weighting (ISR) to fine-tune on hard samples. The approach reports state-of-the-art results across six datasets with strong generalization, and inference speed stays practical because most of the added computation is confined to training. Key innovations include a box-guided contrastive objective, multi-scale semantic alignment, progressive heatmap refinement, and adaptive loss re-weighting, each examined through ablations and cross-domain tests. For clinical colonoscopy, the work aims to improve detection sensitivity and reduce missed polyps without sacrificing real-time performance.
Abstract
Accurate polyp detection is critical for early colorectal cancer diagnosis. Although remarkable progress has been achieved in recent years, the complex colon environment and concealed polyps with unclear boundaries still pose severe challenges. Existing methods either rely on computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases. In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a framework with two-stage training and end-to-end inference: it first trains a general model on images and bounding box annotations, then fine-tunes that model based on its inference scores to obtain a final robust model. Specifically, we conduct Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps. Moreover, to enhance the recognition of small polyps, we design the Semantic Flow-guided Feature Pyramid Network (SFFPN) to aggregate multi-scale features and the Heatmap Propagation (HP) module to boost the model's attention on polyp targets. In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting (ISR) mechanism, which prioritizes hard samples by adaptively adjusting the loss weight of each sample. Extensive experiments on six large-scale colonoscopy datasets demonstrate the superiority of our model over previous state-of-the-art detectors.
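To make the idea of IoU-guided re-weighting concrete, the following is a minimal sketch in plain Python. It is not the formulation used by ECC-PolypDet: the weighting rule `1 + (1 - IoU)^gamma`, the `gamma` parameter, the mean-one normalization, and all function names are assumptions chosen only to illustrate how a low IoU between a prediction and its ground-truth box could translate into a larger loss weight for that sample.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_weights(pred_boxes, gt_boxes, gamma=1.0):
    """Hypothetical rule (an assumption, not the paper's formula):
    w_i = 1 + (1 - IoU_i) ** gamma, rescaled so the weights average to 1.
    Samples the general model localizes poorly receive larger weights."""
    raw = [1.0 + (1.0 - iou(p, g)) ** gamma for p, g in zip(pred_boxes, gt_boxes)]
    if not raw:
        return []
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def reweighted_loss(losses, weights):
    """Mean of per-sample losses after applying the per-sample weights."""
    if not losses:
        return 0.0
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)


if __name__ == "__main__":
    gt = [(0, 0, 2, 2), (0, 0, 2, 2)]
    pred = [(0, 0, 2, 2), (1, 1, 3, 3)]  # first is exact, second overlaps poorly
    w = sample_weights(pred, gt)
    print(w, reweighted_loss([0.5, 0.5], w))
```

In this sketch the weights would be computed once from the first-stage model's predictions and then held fixed, or refreshed periodically, during fine-tuning; which schedule the authors use is not stated in the abstract.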
