Wavelet-based Global-Local Interaction Network with Cross-Attention for Multi-View Diabetic Retinopathy Detection

Yongting Hu; Yuxin Lin; Chengliang Liu; Xiaoling Luo; Xiaoyan Dou; Qihao Xu; Yong Xu

TL;DR

This work tackles multi-view diabetic retinopathy (DR) detection by learning both local lesion details and global context across four fundus views. It introduces a two-branch network, with a CNN branch for local features and a Transformer branch for global dependencies, guided by wavelet high-frequency components that enhance lesion edges. A Cross-View Fusion Module (CVFM) employs cross-attention with a learnable query to reduce inter-view redundancy. Together, the Wavelet-based Global-Local Interaction Module and the CVFM improve multi-view fusion, and the final prediction is formed by fusing the logits of the two branches. Experiments on the large MFIDDR dataset show competitive and often superior performance across multiple metrics. The code is open-sourced for reproducibility, and the results point to potential impact on automated DR screening.
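The idea of fusing views through cross-attention with a learnable query can be illustrated with a minimal NumPy sketch. This is not the authors' CVFM: it assumes one pooled feature vector per view, a single attention head, and randomly initialised weights standing in for learned parameters. The names `cross_view_fusion`, `w_k`, and `w_v` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_fusion(view_feats, query, w_k, w_v):
    """Fuse per-view features with a single query vector (illustrative only).

    view_feats: (V, D) array, one feature vector per fundus view
    query:      (D,) array, stands in for a learnable query
    w_k, w_v:   (D, D) arrays, stand in for learned key/value projections
    """
    keys = view_feats @ w_k                           # (V, D)
    values = view_feats @ w_v                         # (V, D)
    scores = keys @ query / np.sqrt(query.shape[0])   # (V,) scaled dot products
    weights = softmax(scores)                         # attention over the views
    fused = weights @ values                          # (D,) weighted sum of values
    return fused, weights

# Assumption: four views, as in the multi-view setting described above.
rng = np.random.default_rng(0)
V, D = 4, 8
view_feats = rng.normal(size=(V, D))
query = rng.normal(size=D)
w_k = rng.normal(size=(D, D)) / np.sqrt(D)
w_v = rng.normal(size=(D, D)) / np.sqrt(D)

fused, weights = cross_view_fusion(view_feats, query, w_k, w_v)
```

Because the attention weights sum to one across views, a view whose content largely repeats another does not have to contribute equally to the fused vector; how the published module actually handles redundancy may differ from this simplification.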

Abstract

Multi-view diabetic retinopathy (DR) detection has recently emerged as a promising way to address the incomplete lesion coverage of single-view DR detection. However, it remains challenging because lesions vary in size and are scattered across the retina. Furthermore, existing multi-view DR detection methods typically merge multiple views without considering the correlations and redundancies of lesion information across them. Therefore, we propose a novel method to overcome the challenges of difficult lesion information learning and inadequate multi-view fusion. Specifically, we introduce a two-branch network to obtain both local lesion features and their global dependencies. The high-frequency component of the wavelet transform is used to exploit lesion edge information, which is then enhanced by global semantics to facilitate the learning of difficult lesions. Additionally, we present a cross-view fusion module to improve multi-view fusion and reduce redundancy. Experimental results on large public datasets demonstrate the effectiveness of our method. The code is available at https://github.com/HuYongting/WGLIN.
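To make the role of the wavelet high-frequency component concrete, the sketch below computes a single-level 2D Haar transform in NumPy and combines the three detail subbands into an edge-magnitude map. The choice of the Haar wavelet, the single decomposition level, and the helper names `haar_dwt2` and `high_freq_map` are assumptions made for illustration; the abstract does not specify which wavelet or how many levels the method uses.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (even height and width)."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0      # low-frequency approximation
    row_diff = (a + b - c - d) / 2.0    # responds to changes between rows
    col_diff = (a - b + c - d) / 2.0    # responds to changes between columns
    diag = (a - b - c + d) / 2.0        # responds to diagonal changes
    return approx, row_diff, col_diff, diag

def high_freq_map(img):
    """Magnitude of the three detail subbands, a simple proxy for edges."""
    _, row_diff, col_diff, diag = haar_dwt2(img)
    return np.sqrt(row_diff**2 + col_diff**2 + diag**2)

# Toy image: a vertical step edge between columns 2 and 3.
img = np.zeros((8, 8), dtype=float)
img[:, 3:] = 1.0

approx, row_diff, col_diff, diag = haar_dwt2(img)
edges = high_freq_map(img)
```

In this toy case only the column-difference subband responds, and only in the block column that straddles the step, which is the sense in which high-frequency coefficients localise edges while flat regions contribute nothing.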
