Capsule Vision Challenge 2024: Multi-Class Abnormality Classification for Video Capsule Endoscopy

Aakarsh Bansal, Bhuvanesh Singla, Raajan Rajesh Wankhade, Nagamma Patil

TL;DR

This work uses a tiered augmentation strategy, built on the albumentations library, to boost minority-class representation. To reduce learning complexity, training tasks are structured progressively: the model first learns to separate normal from abnormal cases, and more specific classes are then added gradually according to data availability.

Abstract

This study presents an approach to developing a model for classifying abnormalities in video capsule endoscopy (VCE) frames. Given the challenges of data imbalance, we implemented a tiered augmentation strategy using the albumentations library to enhance minority class representation. Additionally, we addressed learning complexities by progressively structuring training tasks, allowing the model to differentiate between normal and abnormal cases and then gradually adding more specific classes based on data availability. Our pipeline, developed in PyTorch, employs a flexible architecture enabling seamless adjustments to classification complexity. We tested our approach using ResNet50 and a custom ViT-CNN hybrid model, with training conducted on the Kaggle platform. This work demonstrates a scalable approach to abnormality classification in VCE.
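The abstract names two ideas without giving details: tiered augmentation and progressively structured training tasks. The sketch below shows one way they could be expressed in plain Python. It is not the authors' implementation. The tier thresholds, the per-image copy counts, the class counts, the "Other abnormal" label, and both function names are hypothetical, chosen only for illustration.

```python
def augmentation_multiplier(count, tiers=((500, 4), (2000, 2))):
    """Return how many augmented copies to create per image of a class.

    `tiers` is a sequence of (max_count, copies) pairs checked in order.
    Smaller classes fall into earlier tiers and receive more copies.
    The thresholds here are assumptions, not values from the paper.
    """
    for max_count, copies in tiers:
        if count < max_count:
            return copies
    return 0  # large classes are left unaugmented


def stage_label_maps(counts, normal="Normal", other="Other abnormal"):
    """Yield one {original_class: training_label} mapping per stage.

    Stage 0 separates Normal from a single merged abnormal label.
    Each later stage splits out the next most frequent abnormal class,
    so classes with more data are introduced first.
    """
    abnormal = sorted((c for c in counts if c != normal),
                      key=lambda c: -counts[c])
    for k in range(len(abnormal) + 1):
        named = set(abnormal[:k])
        yield {c: (c if c == normal or c in named else other)
               for c in counts}


# Hypothetical class counts, for illustration only.
counts = {"Normal": 28000, "Erosion": 2300, "Polyp": 1100, "Worms": 150}

for name, n in counts.items():
    print(name, "copies per image:", augmentation_multiplier(n))

for stage, mapping in enumerate(stage_label_maps(counts)):
    print("stage", stage, sorted(set(mapping.values())))
```

In a real pipeline, the multiplier would decide how many times an albumentations transform is applied to each minority-class image, and each stage's mapping would relabel the dataset before the classifier head is resized for that stage.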

Paper Structure

This paper contains 11 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Pre-augmentation Class Distribution (with Normal class)
  • Figure 2: Pre-augmentation Class Distribution (without Normal class)
  • Figure 3: Post-augmentation Class Distribution (without Normal class)
  • Figure 4: ViT-CNN Architecture and Grad-CAM
  • Figure 5: Block diagram of the developed pipeline.