Rep-GLS: Report-Guided Generalized Label Smoothing for Robust Disease Detection

Kunyu Zhang, Fukang Ge, Binyang Wang, Yingke Chen, Kazuma Kobayashi, Lin Gu, Jinhao Bi, Yingying Zhu

TL;DR

This work addresses the misalignment between medical uncertainty in radiology reports and traditional binary labels used for training. It introduces Rep-GLS, a two-stage framework that first learns a Rate Generation Network (RGN) to map textual uncertainty from radiology reports into a per-disease GLS rate vector $\mathbf{r}_i \in (-1,1)^K$, and then uses these rates to train an LU-ViT classifier with a Generalized Label Smoothing loss. The approach leverages a large language model to extract structured uncertainty from MIMIC-CXR reports, builds a $\sim$340k image-uncertainty dataset, and demonstrates state-of-the-art performance across 14 chest X-ray diseases, with notable gains on rare conditions. The work also provides ablations and visual analyses to show the importance of adopting expert uncertainty as informative supervision, and it promises public release of the dataset, code, and benchmark for reproducibility and broader impact.

Abstract

Unlike natural image classification, where the ground-truth label is explicit and unambiguous, physicians commonly qualify their interpretation of medical images with degrees of certainty, using phrases such as "probable" or "likely". Existing medical image datasets typically overlook this nuance and polarise such findings into binary labels. Here, we propose a novel framework that leverages a Large Language Model (LLM) to mine medical reports directly and use their uncertainty-related expressions as a supervision signal. First, we collect uncertainty keywords from medical reports. Then, we use Qwen-3 4B to identify the textual uncertainty and map it into an adaptive Generalized Label Smoothing (GLS) rate. This rate allows our model to treat uncertain labels not as errors, but as informative signals, effectively incorporating expert skepticism into the training process. We establish a new clinical expert uncertainty-aware benchmark to rigorously evaluate this problem. Experiments demonstrate that our approach significantly outperforms state-of-the-art methods in medical disease detection. The curated uncertainty-word database, code, and benchmark will be made publicly available upon acceptance.
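
To make the role of a per-disease GLS rate concrete, the following is a minimal sketch, not the authors' implementation. It assumes the two-class form of generalized label smoothing, in which a binary label $y$ with rate $r$ becomes the soft target $(1 - r)\,y + r/2$, and it assumes a plain binary cross-entropy on top of those targets. The function names `gls_targets` and `gls_bce_loss` are hypothetical, and the rates are hand-picked here rather than produced by an RGN.

```python
import math


def gls_targets(labels, rates):
    """Soft targets (1 - r) * y + r / 2 for each disease (assumed binary GLS form)."""
    if len(labels) != len(rates):
        raise ValueError("labels and rates must have the same length")
    targets = []
    for y, r in zip(labels, rates):
        # The summary states that each rate lies in the open interval (-1, 1).
        if not -1.0 < r < 1.0:
            raise ValueError("each rate must lie in (-1, 1)")
        targets.append((1.0 - r) * y + r / 2.0)
    return targets


def gls_bce_loss(probs, labels, rates, eps=1e-7):
    """Mean binary cross-entropy against GLS soft targets (illustrative assumption)."""
    targets = gls_targets(labels, rates)
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(targets)


# Hypothetical example: three diseases with hard labels and hand-picked rates.
labels = [1, 0, 1]
rates = [0.0, 0.4, -0.2]
soft = gls_targets(labels, rates)
loss = gls_bce_loss([0.9, 0.3, 0.8], labels, rates)
```

Under this assumed form, a rate of 0 recovers the hard label and the ordinary cross-entropy, a positive rate pulls the target toward 0.5 (a hedged finding), and a negative rate pushes the target past the hard label (a confidently stated finding).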

Paper Structure

This paper contains 24 sections, 4 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: A comparison of approaches for handling noisy labels in medical imaging. (a) Graph-based relabeling methods, which are computationally expensive [chen2023bomd]. (b) A consensus-based method that requires multiple expert annotations, which is costly and unscalable [ju2022improving]. (c) A sample selection method that utilizes incomplete label data by discarding high-loss samples [shao2023lnpl]. (d) Our proposed approach (Rep-GLS), which harnesses expert-written uncertainty as a direct supervisory signal through Generalized Label Smoothing. In the report, (un)certainty words are highlighted in red italics and diseases are underlined in blue.
  • Figure 2: Overview of our approach compared to traditional methods. (a)(b) Our clinical expert-guided GLS approach with graduated smoothing parameters. (c) Traditional GLS methods with uniform smoothing.
  • Figure 3: Statistics of our newly constructed benchmark. (a) The prompt-based extraction workflow. (b-d) Visualizations of the dataset's characteristics, highlighting the distribution of extracted uncertainty (b, d) and significant class imbalance (c).
  • Figure 4: Grad-CAM attention map comparison.