Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye

TL;DR

This work addresses the challenge of representing retinal fundus images for diabetic retinopathy by introducing Guided Context Gating, a modular attention mechanism that jointly learns global context, spatial correlations, and lesion-specific local context. The method combines a Convolutional Base (EfficientNetV2B0), Context Formulation, Channel Correlation, and Guided Gating, followed by a Regularized Classification Head to robustly classify DR severity even with imbalanced data. Empirical results on Zenodo-DR-7 (and additional datasets) show higher accuracy and AUC than competing attention mechanisms and Vision Transformers, along with improved explainability via lesion-focused attention maps and discrimination of intra-similar lesions. The approach demonstrates strong potential for clinical deployment and can extend to other medical imaging tasks requiring precise localization of salient pathology.

Abstract

Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening issues such as diabetic retinopathy. While visual attention-based neural networks have been introduced to learn spatial context and channel correlations from retinal images, they often fall short in capturing localized lesion context. Addressing this limitation, we propose a novel attention mechanism called Guided Context Gating, a unique approach that integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Our qualitative evaluation against existing attention mechanisms emphasizes the superiority of Guided Context Gating in terms of explainability. Notably, experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms and an impressive 6.53% improvement over the state-of-the-art Vision Transformer for assessing the severity grade of retinopathy, even with imbalanced and limited training samples for each class.

Paper Structure (14 sections, 8 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Retinal fundus image highlighting various lesions
  • Figure 2: Architecture of the proposed Guided Context Gating Network, which formulates context from convolutional features and employs it as a guiding signal for computing lesion-specific localized context while retaining both spatial context and channel correlations. Context Formulation -- selectively focuses on relevant features in the initial spatial representations and computes global context information; Channel Correlation -- processes the computed context information and captures channel-wise correlations; Guided Gating -- utilizes context features to compute lesion contextual attention representations.
  • Figure 3: Visual representation of attention maps using various strategies (spatial, channel, global context, gated, and proposed guided context gating) for severe retinopathy. Spatial attention emphasizes local features but lacks broader context, and channel attention captures color and texture well but overlooks spatial context. Global context attention risks oversimplifying lesions, and gating attention emphasizes structures but occasionally highlights unnecessary lesions. Our proposed attention highlights lesion-specific details, combining global and localized context features; dark blue indicates regions of higher attention.
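
The three stages named in the Figure 2 caption can be made concrete with a small NumPy sketch. This is only one plausible reading of the caption, not the authors' formulation: the paper's equations are not reproduced here, so the pooling rule, the bottleneck transform, the residual connection, and all weight names (`w_ctx`, `w_down`, `w_up`, `w_gate`) are assumptions made for illustration. Random features stand in for the output of the convolutional base.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def guided_context_gating(feats, w_ctx, w_down, w_up, w_gate):
    """Illustrative attention block. feats: (H, W, C) feature map.

    All weights are hypothetical parameters, not the paper's.
    """
    h, w, c = feats.shape
    x = feats.reshape(h * w, c)                  # (N, C), N = H*W positions

    # Context formulation (assumed): score each position, then pool the
    # features into one global context vector using those scores.
    scores = softmax(x @ w_ctx, axis=0)          # (N, 1), sums to 1 over positions
    context = (scores * x).sum(axis=0)           # (C,)

    # Channel correlation (assumed): bottleneck transform of the context
    # so that channels can interact.
    hidden = np.maximum(context @ w_down, 0.0)   # (C // r,), ReLU
    channel_ctx = hidden @ w_up                  # (C,)

    # Guided gating (assumed): the context modulates each position's
    # features, and a sigmoid turns the result into a spatial gate.
    gate = sigmoid((x * channel_ctx) @ w_gate)   # (N, 1), values in [0, 1]

    out = x + gate * x                           # residual connection (assumed)
    return out.reshape(h, w, c), gate.reshape(h, w)

# Usage with random stand-in features (7x7 map, 16 channels, reduction r = 4).
rng = np.random.default_rng(0)
H, W, C, r = 7, 7, 16, 4
feats = rng.normal(size=(H, W, C))
w_ctx = rng.normal(size=(C, 1)) * 0.1
w_down = rng.normal(size=(C, C // r)) * 0.1
w_up = rng.normal(size=(C // r, C)) * 0.1
w_gate = rng.normal(size=(C, 1)) * 0.1

out, gate = guided_context_gating(feats, w_ctx, w_down, w_up, w_gate)
print(out.shape, gate.shape)
```

The returned `gate` array has one value per spatial position, so it can be upsampled and overlaid on the input image in the manner of the attention maps in Figure 3.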