Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy
Palak Handa, Amirreza Mahbod, Florian Schwarzhans, Ramona Woitek, Nidhi Goel, Manas Dhir, Deepti Chhabra, Shreshtha Jha, Pallavi Sharma, Vijay Thakur, Simarpreet Singh Chawla, Deepak Gunjan, Jagadeesh Kakarla, Balasubramanian Raman
TL;DR
The Capsule Vision 2024 Challenge addresses automatic multi-class abnormality classification in Video Capsule Endoscopy (VCE) by assembling large-scale, cross-source datasets and defining a rigorous, public benchmarking protocol. The training and validation sets are built from SEE-AI, KID, KVASIR-Capsule, and AIIMS and contain 37,607 and 16,132 frames, respectively, across 10 classes plus normal; a private test set holds 4,385 frames. Evaluation uses a combined metric of mean AUC and balanced accuracy. The paper details registration, rules, dataset development, and the results of 27 qualified teams, whose submissions were mostly transformer- and CNN-based architectures, often with ensembles or transfer learning, and it describes a reproducible pipeline for VCE AI research. The work aims to reduce clinician reading time while preserving diagnostic accuracy, and it provides a public benchmark intended to accelerate practical AI deployment in gastroenterology. Overall, the challenge demonstrates the feasibility and impact of standardized, open VCE AI benchmarking for robust, clinically relevant abnormality classification.
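To make the evaluation metric concrete, the following is a minimal sketch of how a score combining mean AUC and balanced accuracy could be computed. The summary above does not state how the two terms are combined, so the equal-weight average used here is an assumption, as are the function names (`one_vs_rest_auc`, `mean_auc`, `balanced_accuracy`, `combined_score`) and the choice of one-vs-rest AUC; this is not the organizers' evaluation script.

```python
from typing import List, Sequence


def one_vs_rest_auc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUC as the probability that a positive frame outranks a negative one (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc(y_true: Sequence[int], y_prob: Sequence[Sequence[float]], n_classes: int) -> float:
    """Unweighted mean of the one-vs-rest AUC over all classes."""
    aucs: List[float] = []
    for c in range(n_classes):
        class_scores = [row[c] for row in y_prob]
        aucs.append(one_vs_rest_auc(class_scores, [t == c for t in y_true]))
    return sum(aucs) / n_classes


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    recalls: List[float] = []
    for c in range(n_classes):
        idx = [i for i, t in enumerate(y_true) if t == c]
        if not idx:
            raise ValueError(f"class {c} has no samples in y_true")
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / n_classes


def combined_score(y_true: Sequence[int], y_prob: Sequence[Sequence[float]], n_classes: int) -> float:
    """ASSUMPTION: equal-weight average of mean AUC and balanced accuracy."""
    # Predicted label is the class with the highest predicted probability.
    y_pred = [max(range(n_classes), key=lambda c: row[c]) for row in y_prob]
    return 0.5 * (mean_auc(y_true, y_prob, n_classes) + balanced_accuracy(y_true, y_pred, n_classes))
```

Both terms average over classes rather than over frames, which is one plausible reason to pair them: on a dataset dominated by normal frames, a model that ignores rare abnormalities would still score poorly.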
Abstract
We present the Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy. It was organized virtually by the Research Center for Medical Image Analysis and Artificial Intelligence (MIAAI), Department of Medicine, Danube Private University, Krems, Austria, in collaboration with the 9th International Conference on Computer Vision & Image Processing (CVIP 2024), organized by the Indian Institute of Information Technology, Design and Manufacturing (IIITDM) Kancheepuram, Chennai, India. This document provides an overview of the challenge, including the registration process, rules, submission format, a description of the datasets used, the rankings of qualified teams, descriptions of all teams, and the benchmarking results reported by the organizers.
