Expanding Training Data for Endoscopic Phenotyping of Eosinophilic Esophagitis

Juming Xiong, Hou Xiong, Quan Liu, Ruining Deng, Regina N Tyree, Girish Hiremath, Yuankai Huo

TL;DR

This work tackles data scarcity in AI-assisted EoE endoscopy by expanding the training set from 435 to 7050 images (roughly a 16-fold increase) through mining online sources and textbooks, enabling data-efficient classification of EoE phenotypes. It employs a DeiT (Data-efficient Image Transformer, a Vision Transformer variant) architecture with a distillation token and Gradient Attention Rollout to achieve accurate, interpretable multi-label classification across six EoE and five non-EoE classes, leveraging both site-specific pediatric data and web-derived images. Key contributions include a diverse dataset augmented with public sources, an end-to-end transformer-based pipeline, and attention-based visualizations that align with clinically relevant features (e.g., edema, exudates, rings), resulting in improved diagnostic metrics and robustness. The approach has the potential to reduce invasive biopsies by providing scalable, generalizable endoscopic phenotyping across varied clinical contexts.

Abstract

Eosinophilic esophagitis (EoE) is a chronic esophageal disorder marked by eosinophil-dominated inflammation. Diagnosing EoE usually involves endoscopic inspection of the esophageal mucosa and obtaining esophageal biopsies for histologic confirmation. Recently, AI-assisted endoscopic imaging, guided by the EREFS system, has emerged as a potential alternative that could reduce reliance on invasive histological assessments. Despite these advances, significant challenges persist due to the limited availability of data for training AI models, a common issue even in the development of AI for more prevalent diseases. This study seeks to improve the performance of deep learning-based EoE phenotype classification by augmenting our training data with a diverse set of images from online platforms, public datasets, and electronic textbooks, increasing our dataset from 435 to 7050 images. We used the Data-efficient Image Transformer (DeiT) for image classification and incorporated attention map visualizations to improve interpretability. The findings show that our expanded dataset and model enhancements improved diagnostic accuracy, robustness, and comprehensiveness of analysis, with the potential to enhance patient outcomes.
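
As a rough illustration of how attention maps can be derived from a transformer such as DeiT, the sketch below implements plain attention rollout in NumPy. It is not the authors' code: the summary names Gradient Attention Rollout, whereas this is the simpler non-gradient variant, and the function name `attention_rollout`, the parameter `num_special_tokens`, and the random demo inputs are all assumptions made for illustration. The default of two special tokens reflects DeiT's class and distillation tokens.

```python
import numpy as np

def attention_rollout(attentions, num_special_tokens=2):
    """Plain attention rollout (illustrative, not the paper's gradient variant).

    attentions: list with one array per layer, each shaped
                (heads, tokens, tokens), rows already softmax-normalized.
    num_special_tokens: tokens placed before the patch tokens
                (assumed 2 for DeiT: class token + distillation token).
    Returns a 1-D array with one relevance score per image patch.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for layer_attn in attentions:
        fused = layer_attn.mean(axis=0)               # average over heads
        fused = fused + np.eye(num_tokens)            # account for the residual path
        fused = fused / fused.sum(axis=-1, keepdims=True)  # re-normalize rows
        rollout = fused @ rollout                     # compose with earlier layers
    cls_to_patches = rollout[0, num_special_tokens:]  # class-token row, patches only
    return cls_to_patches / cls_to_patches.max()      # scale so the peak is 1

# Hypothetical demo: 4 layers, 3 heads, 2 special tokens + 16 patches (4x4 grid).
rng = np.random.default_rng(0)
raw = rng.random((4, 3, 18, 18))
demo_attn = raw / raw.sum(axis=-1, keepdims=True)     # make each row sum to 1
heatmap = attention_rollout(list(demo_attn)).reshape(4, 4)
```

In practice the per-layer attention arrays would be captured from the trained model, and the resulting patch grid would be upsampled to the image size and overlaid on the endoscopic frame.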

Paper Structure

This paper contains 14 sections, 1 equation, 3 figures, 3 tables.

Figures (3)

  • Figure 1: This figure shows the expansion of our training set from 435 to 7050 images for training EoE deep learning models. The left panel shows a traditional AI model that uses only scarce pediatric EoE data. The right panel shows our approach, which draws on a wider range of data from different sources.
  • Figure 2: This figure provides an overview of the learning framework. (a) illustrates the image classes, which include six EoE classes and five non-EoE classes, along with their approximate locations in the upper GI tract. (b) depicts the proportion of each class within the entire upper GI dataset. (c) displays the distribution of the EoE dataset after mining or collection. Finally, (d) shows how the DeiT AI model classifies these images.
  • Figure 3: This figure shows the endoscopy EoE image and corresponding attention map. Image a shows an endoscopic image with label "exudate", whose attention map is presented in image b. Image c presents an endoscopic image with "rings", whose attention map visualization