
RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation

Yixin Chen, Ziyu Su, Hikmat Khan, Muhammad Khalid Khan Niazi

TL;DR

RANGER is a sparsely-gated Mixture-of-Experts (MoE) framework with adaptive retrieval re-ranking for pathology report generation. It places the MoE in the decoder, together with noisy top-$k$ routing and load-balancing regularization, to enable dynamic expert specialization across diagnostic patterns.

Abstract

Pathology report generation remains a relatively under-explored downstream task, primarily due to the gigapixel scale and complex morphological heterogeneity of Whole Slide Images (WSIs). Existing pathology report generation frameworks typically employ transformer architectures with a homogeneous decoder and static integration of retrieved knowledge. Such designs limit generative specialization and may introduce noisy external guidance during report generation. To address these limitations, we propose RANGER, a sparsely-gated Mixture-of-Experts (MoE) framework with adaptive retrieval re-ranking for pathology report generation. Specifically, we integrate a sparsely-gated MoE into the decoder, along with noisy top-$k$ routing and load-balancing regularization, to enable dynamic expert specialization across various diagnostic patterns. Additionally, we introduce an adaptive retrieval re-ranking module that selectively refines retrieved memory from a knowledge base before integration, reducing noise and improving semantic alignment based on visual feature representations. We perform extensive experiments on the PathText-BRCA dataset and demonstrate consistent improvements over existing approaches across standard natural language generation metrics. Our full RANGER model achieves the best performance on the PathText dataset, reaching BLEU-1 to BLEU-4 scores of 0.4598, 0.3044, 0.2036, and 0.1435, respectively, a METEOR of 0.1883, and a ROUGE-L of 0.3038, validating the effectiveness of dynamic expert routing and adaptive knowledge refinement for semantically grounded pathology report generation.
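To make the routing idea concrete, here is a minimal sketch of noisy top-$k$ gating for a single token, written in plain Python. It follows the commonly used formulation in which Gaussian noise, scaled by a learned softplus term, is added to each expert's logit before the $k$ largest are kept and renormalized with a softmax. The abstract does not give RANGER's exact gating equations, so the function name, the `w_gate` and `w_noise` parameters, and the noise form are all assumptions made for illustration.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noisy_top_k_gate(x, w_gate, w_noise, k, rng=None, train=True):
    """Return one gate weight per expert for a single token vector x.

    w_gate and w_noise hold one weight vector per expert (hypothetical
    names, not taken from the paper). Only k experts get non-zero weight.
    """
    rng = rng if rng is not None else random
    num_experts = len(w_gate)
    logits = []
    for e in range(num_experts):
        logit = sum(xi * wi for xi, wi in zip(x, w_gate[e]))
        if train:
            # Assumed noise model: standard normal scaled by a learned softplus.
            scale = softplus(sum(xi * wi for xi, wi in zip(x, w_noise[e])))
            logit += rng.gauss(0.0, 1.0) * scale
        logits.append(logit)

    # Keep the k largest logits; every other expert is masked out.
    top = sorted(range(num_experts), key=lambda e: logits[e], reverse=True)[:k]

    # Softmax over the surviving logits only.
    peak = max(logits[e] for e in top)
    exps = {e: math.exp(logits[e] - peak) for e in top}
    total = sum(exps.values())

    gates = [0.0] * num_experts
    for e in top:
        gates[e] = exps[e] / total
    return gates


# Usage: four experts, two active per token, noise disabled for determinism.
example_gates = noisy_top_k_gate(
    x=[1.0, 0.0],
    w_gate=[[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [0.0, 0.0]],
    w_noise=[[0.0, 0.0]] * 4,
    k=2,
    train=False,
)
```

With noise disabled, the two experts with the largest logits share all of the gate mass and the other two receive exactly zero, which is the sparsity that lets each token pay for only $k$ expert evaluations.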

Paper Structure

This paper contains 28 sections, 15 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of the proposed RANGER framework. (a) Visual branch for whole-slide image feature extraction. (b) Dual learnable tokens that cross-attend to visual features and retrieved textual knowledge, respectively, enabling cross-modal interaction. (c) Construction of a sentence-level memory bank from historical pathology reports. (d) Two-stage retrieval module that first performs similarity-based candidate selection and then applies a learned re-ranking mechanism for adaptive knowledge refinement. (e) Sparsely-gated Mixture-of-Experts (MoE) decoder with noisy top-$k$ routing and load-balancing regularization for conditional expert specialization during report generation.
  • Figure 2: Qualitative comparison of generated and reference pathology report text. Correctly generated words and phrases are highlighted in red.
  • Figure 3: Ablation study on the Mixture-of-Experts (MoE) decoder. Left: Performance variation with different numbers of experts. Right: Effect of top-$k$ routing during expert activation. Moderate expert capacity with sparse routing achieves the best trade-off between specialization and stability.
  • Figure 4: Effect of load-balance coefficient $\lambda$ on MoE decoder performance (BLEU-4). $\lambda = 0.01$ yields the best performance. Too-small $\lambda$ (0.001) fails to prevent expert collapse and performs worse than no load balancing, while too-large $\lambda$ (0.1) over-constrains routing. Dashed line: cosine-retrieval baseline.
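
The role of $\lambda$ in Figure 4 can be illustrated with a small sketch of a load-balancing penalty. The captions do not state the exact regularizer used in RANGER, so the version below assumes one common choice: the squared coefficient of variation of per-expert "importance" (the sum of gate weights each expert receives over a batch), multiplied by $\lambda$. The function name and argument layout are hypothetical.

```python
def load_balance_penalty(batch_gates, lam=0.01):
    """Return lam * CV^2 of per-expert importance over a batch.

    batch_gates is a list of per-token gate vectors, one weight per expert.
    This is an assumed regularizer, not necessarily the one used in RANGER.
    """
    num_experts = len(batch_gates[0])
    # Importance of an expert: total gate weight it receives across the batch.
    importance = [sum(g[e] for g in batch_gates) for e in range(num_experts)]
    mean = sum(importance) / num_experts
    if mean == 0.0:
        return 0.0
    variance = sum((v - mean) ** 2 for v in importance) / num_experts
    # Squared coefficient of variation: zero when all experts are used equally.
    return lam * variance / (mean ** 2)


# Usage: evenly spread routing versus routing collapsed onto one expert.
balanced = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
collapsed = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
balanced_penalty = load_balance_penalty(balanced, lam=0.01)
collapsed_penalty = load_balance_penalty(collapsed, lam=0.01)
```

Under this assumed form, evenly spread routing incurs no penalty, while routing that collapses onto one expert is penalized in proportion to $\lambda$. That matches the trade-off the caption describes: a very small $\lambda$ barely discourages collapse, and a very large one can dominate the generation loss and over-constrain routing.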