Adaptify: A Refined Adaptation Scheme for Frame Classification in Atrophic Gastritis Videos

Zinan Xiong, Shuijiao Chen, Yizhe Zhang, Yu Cao, Benyuan Liu, Xiaowei Liu

TL;DR

Adaptify tackles temporal instability in frame-level gastritis classification from gastroscopy videos by introducing an unsupervised online test-time adaptation framework with a fixed main network and a trainable auxiliary network. It leverages temporal context through a rolling buffer of past auxiliary outputs and a weighted fusion with the current frame, controlled by hyperparameters $\alpha$, $\beta$, and buffer size $K$, and optimizes the auxiliary model via a cross-entropy objective using online updates $\Delta \theta_t^{\text{aux}} = \gamma \Delta \theta_{t-1}^{\text{aux}} + \lambda \nabla_{\theta^{\text{aux}}} \mathcal{L}(y_t^{\text{aux}}, y_t^{\text{cls}})$. The key contributions include the formal Adaptify framework and its extension of AuxAdapt to classification tasks, demonstrated by improved temporal coherence and reduced false positives/negatives in video sequences. Practically, this approach enables more reliable, real-time frame classifications in endoscopic workflows without requiring additional labeled data during deployment.
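The loop described above can be illustrated with a minimal sketch. Everything in it beyond the summary is an assumption made for illustration: the class name `AdaptifySketch`, the linear softmax stand-in for the auxiliary network, the exact roles of $\alpha$ (main vs. auxiliary weighting) and $\beta$ (current frame vs. buffered context), the use of the fused decision as the pseudo-label $y_t^{\text{cls}}$, and the sign convention $\theta \leftarrow \theta - \Delta\theta_t$. The paper's actual fusion rule and networks may differ.

```python
import math
from collections import deque


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class AdaptifySketch:
    """Hypothetical toy version: fixed main model, trainable linear auxiliary model."""

    def __init__(self, n_features, n_classes=2, alpha=0.5, beta=0.5, K=5,
                 lr=0.1, momentum=0.9):
        self.W = [[0.0] * n_features for _ in range(n_classes)]      # auxiliary weights
        self.delta = [[0.0] * n_features for _ in range(n_classes)]  # previous update
        self.alpha, self.beta = alpha, beta    # fusion weights (assumed roles)
        self.lr, self.momentum = lr, momentum  # lambda and gamma in the update rule
        self.buffer = deque(maxlen=K)          # rolling buffer of past auxiliary outputs

    def aux_forward(self, x):
        return softmax([sum(w * v for w, v in zip(row, x)) for row in self.W])

    def step(self, x, p_main):
        """Process one frame: x is a feature vector, p_main the frozen main model's output."""
        p_aux = self.aux_forward(x)
        if self.buffer:
            # Assumed temporal context: mean of the buffered auxiliary outputs.
            n = len(self.buffer)
            ctx = [sum(p[c] for p in self.buffer) / n for c in range(len(p_aux))]
            p_smooth = [self.beta * a + (1 - self.beta) * c for a, c in zip(p_aux, ctx)]
        else:
            p_smooth = p_aux
        fused = [self.alpha * m + (1 - self.alpha) * s for m, s in zip(p_main, p_smooth)]
        y_cls = max(range(len(fused)), key=fused.__getitem__)  # pseudo-label, no ground truth
        # Cross-entropy gradient of a linear softmax model: (p_aux - onehot) * x,
        # applied with momentum: delta_t = gamma * delta_{t-1} + lambda * grad.
        for c, row in enumerate(self.W):
            err = p_aux[c] - (1.0 if c == y_cls else 0.0)
            for j, xj in enumerate(x):
                self.delta[c][j] = self.momentum * self.delta[c][j] + self.lr * err * xj
                row[j] -= self.delta[c][j]
        self.buffer.append(p_aux)
        return y_cls, fused


# Usage: the same synthetic frame repeated, with the main model favouring class 1.
model = AdaptifySketch(n_features=2, K=3)
for _ in range(20):
    label, fused = model.step([1.0, 0.5], [0.1, 0.9])
```

Only the auxiliary weights change during the loop; the main model's output `p_main` is treated as a fixed input, which mirrors the fixed-main, trainable-auxiliary split in the summary.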

Abstract

Atrophic gastritis is a significant risk factor for developing gastric cancer. Machine learning algorithms can improve the accuracy of atrophic gastritis detection. Nevertheless, when a trained model is applied in real-life circumstances, its output is often not consistently reliable. In this paper, we propose Adaptify, an adaptation scheme in which the model assimilates knowledge from its own classification decisions. Our approach keeps the primary model fixed while running and updating an auxiliary model in parallel. By integrating the knowledge gleaned by the auxiliary model into the primary model and merging their outputs, we observe a notable improvement in output stability and consistency compared to relying solely on either the main model or the auxiliary model.

Paper Structure

This paper contains 7 sections, 2 equations, 4 figures, 4 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Left: Adaptation considering 1 frame. Right: Adaptation considering K frames.
  • Figure 2: Example images from the training dataset. Top: atrophic gastritis images; Bottom: healthy images.
  • Figure 3: Performance Evaluation on a single video using ResNet-50 and MobileNet-V2. The first panel shows the ground truth. The horizontal axis represents the frame number.
  • Figure 4: Performance Evaluation on a single video using ResNet-50 and EfficientNet-B3. The horizontal axis represents the frame number.