Pathology-Aware Prototype Evolution via LLM-Driven Semantic Disambiguation for Multicenter Diabetic Retinopathy Diagnosis

Chunzheng Zhu, Yangfang Lin, Jialin Shao, Jianxin Lin, Yijun Wang

TL;DR

This work tackles semantic confusion and domain shift in diabetic retinopathy (DR) grading by evolving visual prototypes with pathology-informed semantics. It introduces the Hierarchical Anchor Prototype Modulation (HAPM) framework, which combines a variance-spectrum anchor library, hierarchical dynamic prompt gating across LVLM and LLM sources, and a two-stage modulation (PSI and DPE) that injects detailed pathological knowledge into prototypes. Cross-domain experiments on eight public DR datasets show HAPM achieving state-of-the-art performance, particularly in distinguishing adjacent DR grades and borderline cases, while keeping the backbone frozen for robustness. The approach enhances interpretability by aligning textual pathology descriptors with visual features, offering a clinically meaningful pathway for robust, cross-domain DR diagnosis.

Abstract

Diabetic retinopathy (DR) grading plays a critical role in early clinical intervention and vision preservation. Recent explorations predominantly focus on visual lesion feature extraction through data processing and domain decoupling strategies. However, they generally overlook domain-invariant pathological patterns and underutilize the rich contextual knowledge of foundation models, relying solely on visual information, which is insufficient for distinguishing subtle pathological variations. Therefore, we propose integrating fine-grained pathological descriptions to complement prototypes with additional context, thereby resolving ambiguities in borderline cases. Specifically, we propose a Hierarchical Anchor Prototype Modulation (HAPM) framework to facilitate DR grading. First, we introduce a variance spectrum-driven anchor prototype library that preserves domain-invariant pathological patterns. We further employ a hierarchical differential prompt gating mechanism, dynamically selecting discriminative semantic prompts from both LVLM and LLM sources to address semantic confusion between adjacent DR grades. Finally, we utilize a two-stage prototype modulation strategy that progressively integrates clinical knowledge into visual prototypes through a Pathological Semantic Injector (PSI) and a Discriminative Prototype Enhancer (DPE). Extensive experiments across eight public datasets demonstrate that our approach achieves pathology-guided prototype evolution while outperforming state-of-the-art methods. The code is available at https://github.com/zhcz328/HAPM.
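The prompt gating idea in the abstract, keeping the prompts that best separate adjacent DR grades, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes prompts are already embedded as vectors and scores each prompt by its cosine distance to the nearest prompt of an adjacent grade. The function name `select_discriminative_prompts` and the parameter `k` are hypothetical.

```python
import numpy as np

def select_discriminative_prompts(prompt_emb, grades, k=2):
    """Hypothetical sketch: per grade, keep the k prompts that lie farthest
    (in cosine distance) from every prompt of an adjacent grade."""
    prompt_emb = np.asarray(prompt_emb, dtype=float)
    grades = np.asarray(grades)
    # L2-normalise so that a dot product equals cosine similarity
    emb = prompt_emb / np.linalg.norm(prompt_emb, axis=1, keepdims=True)
    selected = {}
    for g in np.unique(grades):
        own = np.where(grades == g)[0]
        adj = np.where(np.abs(grades - g) == 1)[0]
        if adj.size == 0:
            # No adjacent grade present: all prompts tie, keep the first k
            score = np.zeros(own.size)
        else:
            # Distance from each prompt to its nearest adjacent-grade prompt
            score = (1.0 - emb[own] @ emb[adj].T).min(axis=1)
        top = own[np.argsort(-score, kind="stable")[:k]]
        selected[int(g)] = sorted(top.tolist())
    return selected
```

A larger score means the prompt is less likely to be confused with a neighbouring grade, which matches the "larger inter-class distance" criterion mentioned in the Figure 4 caption. The paper's actual gating is hierarchical and draws on both LVLM and LLM sources, which this sketch does not model.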

Paper Structure

This paper contains 19 sections, 16 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: (a) The same DR grade appears differently across domains, and subtle differences between adjacent grades easily cause confusion. (b) Our framework combines LLM and LVLM technologies for accurate and efficient grading.
  • Figure 2: Overview of our method. We first build an anchor prototype library using variance spectrum analysis, then apply a Hierarchical Dynamic Prompt (HDP) Gating to select discriminative prompts. The prototypes are enhanced via two-stage modulation with the Pathological Semantic Injector (PSI) and Discriminative Prototype Enhancer (DPE) for DR grading.
  • Figure 3: HDP Gating selectively filters the most discriminative prompts from both LLM and LVLM sources to reduce semantic confusion between adjacent DR grades.
  • Figure 4: Selected prompts with larger inter-class distance.
  • Figure 5: The DG performance comparison on six benchmark datasets and average levels. The red areas indicate our method’s performance gain over others on each dataset.
  • ...and 2 more figures