Exploring Transfer Learning for Deep Learning Polyp Detection in Colonoscopy Images Using YOLOv8
Fabian Vazquez, Jose Angel Nuñez, Xiaoyan Fu, Pengfei Gu, Bin Fu
TL;DR
This study tackles the data scarcity challenge in polyp detection by leveraging transfer learning with YOLOv8n. It compares in-domain medical pre-training against out-of-domain general datasets across two pre-training phases and evaluates fine-tuning performance on four polyp datasets, using metrics such as $mAP_{50}$ and $mAP_{50:95}$. The results show that all pre-trained models outperform training from scratch, with larger, more diverse pre-training data providing the strongest gains, though out-of-domain sources like COCO can outperform some in-domain datasets. The work contributes practical insights into dataset selection for pre-training in medical imaging, demonstrates faster convergence with pre-trained weights, and releases models and code to support future research in automated polyp detection.
Abstract
Deep learning methods have demonstrated strong performance in object detection tasks; however, their ability to learn domain-specific applications with limited training data remains a significant challenge. Transfer learning techniques address this issue by leveraging knowledge from pre-training on related datasets, enabling faster and more efficient learning for new tasks. Choosing the right pre-training dataset can play a critical role in determining the success of transfer learning and overall model performance. In this paper, we investigate the impact of pre-training a YOLOv8n model on seven distinct datasets, evaluating their effectiveness when transferred to the task of polyp detection. We compare whether large, general-purpose datasets with diverse objects outperform niche datasets with characteristics similar to polyps. In addition, we assess the influence of dataset size on the efficacy of transfer learning. Experiments on the polyp datasets show that models pre-trained on relevant datasets consistently outperform those trained from scratch, highlighting the benefit of pre-training on datasets with shared domain-specific features.
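The metrics named in the TL;DR, $mAP_{50}$ and $mAP_{50:95}$, both rest on the intersection over union (IoU) between a predicted box and a ground-truth box: $mAP_{50}$ counts a detection as correct at an IoU of at least 0.50, while $mAP_{50:95}$ averages over the ten COCO-style thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal illustrative sketch of that thresholding step only, not the full average-precision computation and not the authors' evaluation code. The function names, the `(x1, y1, x2, y2)` box format, and the example boxes are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction would count as a true positive."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical example: a prediction shifted 10 px from a 100x100 ground truth.
gt = (0, 0, 100, 100)
pred = (10, 10, 110, 110)
print(round(iou(pred, gt), 4))
print(matched_thresholds(pred, gt))
```

In this example the prediction counts as correct under the $mAP_{50}$ criterion but fails the stricter thresholds, which is why $mAP_{50:95}$ rewards tighter localization than $mAP_{50}$ does.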
