Error Correction in Radiology Reports: A Knowledge Distillation-Based Multi-Stage Framework

Jinge Wu, Zhaolong Wu, Ruizhe Li, Tong Chen, Abul Hasan, Yunsoo Kim, Jason P. Y. Cheung, Teng Zhang, Honghan Wu

TL;DR

The paper tackles the problem of error-prone radiology reports by introducing a staged proofreading framework that decomposes proofreading into detection, localization, and correction, guided by dual knowledge infusion. Medical Knowledge Graph Distillation (MKGD) converts reports into structured graphs using RadGraph, while External Knowledge Retrieval (EXKR) brings in external reference patterns, enabling precise and clinically grounded corrections without extensive fine-tuning. A comprehensive MIMIC-CXR-based benchmark with real-world error patterns validates the approach, showing substantial gains in error detection and localization, plus improved factual consistency and efficiency, as confirmed by radiologists. The work demonstrates that combining structured medical knowledge with targeted retrieval and staged reasoning transforms general LLMs into safer, interpretable radiology QA tools, with practical impact for high-volume clinical settings.

Abstract

The increasing complexity and workload of clinical radiology lead to inevitable oversights and mistakes in radiology reports used as diagnostic tools, causing delayed treatment and sometimes life-threatening harm to patients. While large language models (LLMs) have shown remarkable progress in many tasks, their utility in detecting and correcting errors in radiology reporting remains limited. This paper proposes a novel dual-knowledge infusion framework that enhances LLMs' capability for radiology report proofreading through systematic integration of medical expertise. Specifically, the knowledge infusion combines medical knowledge graph distillation (MKGD) with external knowledge retrieval (EXKR), enabling an effective automated approach to tackling mistakes in radiology reporting. By decomposing the complex proofreading task into three specialized stages of detection, localization, and correction, our method mirrors the systematic review process employed by expert radiologists, ensuring both precision and clinical interpretability. To perform a robust, clinically relevant evaluation, a comprehensive benchmark is also proposed using real-world radiology reports with real-world error patterns, including speech recognition confusions, terminology ambiguities, and template-related inconsistencies. Extensive evaluations across multiple LLM architectures demonstrate substantial improvements from our approach: up to a 31.56% increase in error detection accuracy and a 37.4% reduction in processing time. Human evaluation by radiologists confirms superior clinical relevance and factual consistency compared to existing approaches.
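
The three-stage decomposition described above (detection, localization, correction) can be illustrated with a minimal sketch. This is not the authors' implementation: the `proofread` function, the `ProofreadResult` type, the prompt wording, and the `toy_llm` stub are all hypothetical names introduced here for illustration, and the `knowledge` argument merely stands in for whatever context MKGD and EXKR would supply. The sketch only shows how each stage can be gated on the output of the previous one.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: any function that maps a prompt string to a model reply.
LLM = Callable[[str], str]


@dataclass
class ProofreadResult:
    has_error: bool
    error_term: Optional[str] = None
    corrected_report: Optional[str] = None


def proofread(report: str, llm: LLM, knowledge: str = "") -> ProofreadResult:
    """Run detection, then localization, then correction (illustrative only)."""
    # Stage 1: detection. Stop early if the model sees no error.
    answer = llm(f"{knowledge}\nTask: detect\nReport: {report}\nAnswer yes or no.")
    if not answer.strip().lower().startswith("yes"):
        return ProofreadResult(has_error=False)

    # Stage 2: localization. Ask for the single erroneous term.
    term = llm(f"{knowledge}\nTask: localize\nReport: {report}").strip()

    # Stage 3: correction. Ask for a replacement for that term only.
    replacement = llm(
        f"{knowledge}\nTask: correct\nReport: {report}\nTerm: {term}"
    ).strip()
    return ProofreadResult(True, term, report.replace(term, replacement, 1))


def toy_llm(prompt: str) -> str:
    """Stand-in for a real model; hard-codes the example from Figure 1."""
    if "Task: detect" in prompt:
        return "yes" if "congestion" in prompt else "no"
    if "Task: localize" in prompt:
        return "congestion"
    return "consolidation"


result = proofread("Right lower lobe congestion is noted.", toy_llm)
print(result.corrected_report)  # Right lower lobe consolidation is noted.
```

Gating the later stages on the detection result means that reports judged error-free cost a single model call, which is one plausible reading of why a staged design can reduce processing time; the source does not spell out the mechanism.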

Paper Structure

This paper contains 19 sections, 1 equation, 3 figures, 3 tables.

Figures (3)

  • Figure 1: An overview of our medical report proofreading framework. The staged inference process (top) breaks down error correction into detection (identifying error presence), localization (pinpointing error terms like "congestion"), and correction (providing proper replacements like "consolidation"). The dual-knowledge infusion framework (bottom) supports this process through MKGD's structural analysis and EXKR's domain knowledge integration, enabling accurate and clinically sound corrections.
  • Figure 2: Illustration of our dual-knowledge infusion framework. Left: Input medical report with task description and reference examples. Right: MKGD transforms the report into a structured graph representation capturing anatomical entities (ANAT-DP) and observations (OBS-DP/DA) with their relationships (modify, located_at, suggestive_of), while EXKR provides relevant domain knowledge from reference reports to guide the correction process.
  • Figure 3: Human evaluation comparison of error correction on baseline versus our proposed staged proofreading inference with dual-knowledge infusion framework.
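
The structured representation described in the Figure 2 caption (entities labelled ANAT-DP or OBS-DP/DA, connected by modify, located_at, and suggestive_of relations) can be pictured as a small labelled graph. The sketch below is an assumption-laden illustration, not RadGraph's actual output format or the paper's code: the `Entity` and `ReportGraph` classes and the `to_prompt` text layout are hypothetical. It shows one way such a graph could be flattened into lines of text for an LLM prompt.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Relation names taken from the Figure 2 caption.
RELATIONS = {"modify", "located_at", "suggestive_of"}


@dataclass(frozen=True)
class Entity:
    text: str
    label: str  # e.g. "ANAT-DP", "OBS-DP", "OBS-DA"


@dataclass
class ReportGraph:
    entities: List[Entity] = field(default_factory=list)
    # Each relation is (source index, relation name, target index).
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_entity(self, text: str, label: str) -> int:
        """Store an entity and return its index for use in relations."""
        self.entities.append(Entity(text, label))
        return len(self.entities) - 1

    def add_relation(self, src: int, relation: str, dst: int) -> None:
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.relations.append((src, relation, dst))

    def to_prompt(self) -> str:
        """Flatten the graph into one line per relation (hypothetical layout)."""
        lines = []
        for src, relation, dst in self.relations:
            a, b = self.entities[src], self.entities[dst]
            lines.append(f"{a.text} [{a.label}] --{relation}--> {b.text} [{b.label}]")
        return "\n".join(lines)


graph = ReportGraph()
obs = graph.add_entity("consolidation", "OBS-DP")
anat = graph.add_entity("right lower lobe", "ANAT-DP")
graph.add_relation(obs, "located_at", anat)
print(graph.to_prompt())
```

A flat, one-relation-per-line layout is used here only because it is easy to read and to paste into a prompt; how the paper actually serializes the graph is not stated in this summary.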