Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations

Peng Xia; Ming Hu; Feilong Tang; Wenxue Li; Wenhao Zheng; Lie Ju; Peibo Duan; Huaxiu Yao; Zongyuan Ge

TL;DR

Diabetic retinopathy (DR) grading models degrade under domain shifts caused by imaging conditions, demographics, and diagnostic criteria, which limits deployment in diverse clinical settings. The authors propose DECO, a disentangled representation framework that separates DR-relevant retinal semantics from domain noise and recombines them across domains to synthesize diverse, domain-invariant features. Class prototypes refine the semantic features and domain prototypes regularize the domain noise, both through interpolation with data-aware weights that emphasize rare classes and domains. A robust pixel-level semantic alignment loss then aligns the decoupled retinal semantics, balancing intra-class diversity against dense class features while preserving inter-class distinctions. On GDRBench, DECO generalizes to unseen domains better than state-of-the-art methods, with particular improvements on underrepresented datasets. Code is available at https://github.com/richard-peng-xia/DECO.

Abstract

Diabetic Retinopathy (DR), induced by diabetes, poses a significant risk of visual impairment. Accurate and effective grading of DR aids in the treatment of this condition. Yet existing models experience notable performance degradation on unseen domains due to domain shifts. Previous methods address this issue by simulating domain style through simple visual transformations and by mitigating domain noise through robust representation learning. However, domain shifts encompass more than image styles, and these methods overlook biases caused by implicit factors such as ethnicity, age, and diagnostic criteria. In our work, we propose a novel framework in which representations of paired data from different domains are decoupled into semantic features and domain noise. The resulting augmented representation combines the original retinal semantics with domain noise from other domains, yielding enhanced representations that incorporate rich information from diverse domains and are better aligned with real-world clinical needs. Subsequently, to improve the robustness of the decoupled representations, class and domain prototypes are used to interpolate the disentangled representations, with data-aware weights designed to focus on rare classes and domains. Finally, we devise a robust pixel-level semantic alignment loss that aligns the retinal semantics decoupled from the features, maintaining a balance between intra-class diversity and dense class features. Experimental results on multiple benchmarks demonstrate the effectiveness of our method on unseen domains. The code is available at https://github.com/richard-peng-xia/DECO.
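To make the recombination and prototype-interpolation ideas in the abstract more concrete, the following is a minimal NumPy sketch under strong simplifying assumptions, not the authors' implementation. The function names (`split_features`, `recombine_across_domains`, `prototype_interpolate`) are hypothetical, the "decoupling" is a fixed split of the feature vector rather than a learned one, and the frequency-based weight is only one plausible reading of "data-aware weights". It also assumes the batch contains at least two domains.

```python
import numpy as np

def split_features(z):
    """Hypothetical stand-in for learned decoupling:
    first half of each vector = semantics, second half = domain noise."""
    half = z.shape[1] // 2
    return z[:, :half], z[:, half:]

def recombine_across_domains(z, domains, rng):
    """Keep each sample's semantics, but attach domain noise taken from a
    randomly chosen sample that belongs to a different domain."""
    sem, noise = split_features(z)
    partner = np.empty(len(z), dtype=int)
    for i, d in enumerate(domains):
        candidates = np.flatnonzero(domains != d)  # samples from other domains
        partner[i] = rng.choice(candidates)
    augmented = np.concatenate([sem, noise[partner]], axis=1)
    return augmented, partner

def prototype_interpolate(feats, groups):
    """Pull each feature toward its group prototype (the group mean).
    Assumed data-aware weight: rarer groups are pulled more strongly."""
    out = feats.copy()
    ids, counts = np.unique(groups, return_counts=True)
    for g, c in zip(ids, counts):
        mask = groups == g
        proto = feats[mask].mean(axis=0)
        w = 1.0 - c / len(groups)  # assumption, not taken from the paper
        out[mask] = (1.0 - w) * feats[mask] + w * proto
    return out

# Toy usage: 6 samples, 8-dim features, 3 domains, 2 DR grades.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 8))
domains = np.array([0, 0, 1, 1, 2, 2])
labels = np.array([0, 1, 0, 1, 0, 0])

augmented, partner = recombine_across_domains(z, domains, rng)
semantics, _ = split_features(z)
refined = prototype_interpolate(semantics, labels)
```

In this sketch the same `prototype_interpolate` helper could be applied to the noise half with domain labels as `groups`, mirroring the class-prototype and domain-prototype roles described above; how the actual method weights and combines these terms is specified in the paper and repository, not here.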

Paper Structure

This paper contains 4 sections, 9 equations, 5 figures, and 4 tables.

Introduction
Methodology
Experiments
Conclusion

Figures (5)

Figure 1: (Left) An example of fundus-based domain variances. The horizontal distance represents domain differences, and the vertical distance denotes DR category differences. (Right) The motivations of our approach. First, whereas existing augmentation methods are simple visual transformations, we consider feature-level, class-agnostic latent noise, such as macular degeneration caused by age. Second, existing pixel-level alignment may act on features that contain domain bias; we therefore replace the original features with decoupled semantic features to alleviate domain noise.
Figure 2: The overview of our proposed method. (a) Representation decoupling and recombination. (b) Representation enhancement with class and domain prototypes. The specific process is shown in the right panel. (c) Robust pixel-level semantic alignment.
Figure 3: Analysis of augmentation methods and $\mathcal{L}_{pixel}$ under the DG test.
Figure 4: Data distribution for each category in 6 datasets.
Figure 5: Distribution of individual datasets.
