CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models

Sheng Wang, Tianming Du, Katherine Fischer, Gregory E Tasian, Justin Ziemba, Joanie M Garratt, Hersh Sagreiya, Yong Fan

TL;DR

CopilotCAD presents an assistive radiology framework that couples Large Language Models with medical image foundation tools to generate structured report content and quantitative evidence while keeping radiologists in the decision loop. By routing image analysis through a model ensemble and using radiomics-informed prompts, the system delivers draft reports that radiologists review (L2-L3 autonomy), improving report completion quality at reduced computational cost. Early experiments on 22,109 CT urography reports show notable gains in BLEU-4 and ROUGE metrics when radiomics are included, with partial generalization to other organs and imaging modalities. While promising for reducing workload and increasing explainability, the approach remains a prototype with limitations in abnormality detection, latency, and zero-shot adaptability, highlighting a pathway toward safer, semi-automatic radiology reporting.

Abstract

Computer-aided diagnosis systems hold great promise to aid radiologists and clinicians in radiological clinical practice and enhance diagnostic accuracy and efficiency. However, the conventional systems primarily focus on delivering diagnostic results through text report generation or medical image classification, positioning them as standalone decision-makers rather than helpers and ignoring radiologists' expertise. This study introduces an innovative paradigm to create an assistive co-pilot system for empowering radiologists by leveraging Large Language Models (LLMs) and medical image analysis tools. Specifically, we develop a collaborative framework to integrate LLMs and quantitative medical image analysis results generated by foundation models with radiologists in the loop, achieving efficient and safe generation of radiology reports and effective utilization of computational power of AI and the expertise of medical professionals. This approach empowers radiologists to generate more precise and detailed diagnostic reports, enhancing patient outcomes while reducing the burnout of clinicians. Our methodology underscores the potential of AI as a supportive tool in medical diagnostics, promoting a harmonious integration of technology and human expertise to advance the field of radiology.
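
As a rough illustration of how quantitative image analysis results might be handed to an LLM for report completion, the sketch below serializes a few measurements into a text prompt placed ahead of the radiologist's partial draft. It is a minimal sketch under stated assumptions, not the authors' implementation: the `OrganMeasurement` type, the `build_completion_prompt` function, the prompt wording, and the numeric values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class OrganMeasurement:
    """Hypothetical container for one quantitative finding from image analysis."""
    organ: str
    volume_ml: float
    mean_hu: float  # mean attenuation in Hounsfield units


def build_completion_prompt(measurements, partial_report):
    """Serialize measurements as text and place them before the draft to be continued."""
    evidence_lines = [
        f"- {m.organ}: volume {m.volume_ml:.1f} mL, mean attenuation {m.mean_hu:.0f} HU"
        for m in measurements
    ]
    return (
        "Quantitative evidence from image analysis:\n"
        + "\n".join(evidence_lines)
        + "\n\nReport draft (continue from here):\n"
        + partial_report
    )


# Illustrative values only; they are not taken from the paper or its dataset.
measurements = [
    OrganMeasurement("left kidney", 162.4, 31.0),
    OrganMeasurement("right kidney", 158.9, 30.0),
]
prompt = build_completion_prompt(measurements, "KIDNEYS: The kidneys are")
print(prompt)
```

In a setup like this, the completion returned by the language model would be shown to the radiologist as a suggestion to accept, edit, or discard, which keeps the human in the loop.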

Paper Structure (13 sections, 6 figures, 2 tables)

This paper contains 13 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: CopilotCAD integrates the computational efficiency of AI with the expertise of radiologists and provides a user-friendly interface to facilitate interactive image-based diagnosis, enabling radiologists to make informed decisions supported by AI-generated quantitative data and visual aids with enhanced explainability, transparency, and safety, reflecting a paradigm shift away from traditional CAD systems.
  • Figure 2: Overall architecture of CopilotCAD, consisting of LLMs, medical image analysis models, and an interface to facilitate interactive cross-communication between the AI systems and human experts.
  • Figure 3: Illustration of the data organization for training CopilotCAD, including imaging data, text data, and data cleaning and organization.
  • Figure 4: Overview of our in-house CTU report dataset. (a) Histogram of patient ages, spanning from young adults to elderly individuals. (b) Distribution of report lengths, with the majority falling within a certain character range. (c) Ethnicity breakdown, predominantly White. (d) Gender split between Female and Male. (e) Frequency of mentions for various abdominal organs, highlighting the most commonly discussed structures. (f) Frequency of pathological findings and abnormalities, covering a diverse range of medical conditions. In (e-f), dark green marks the items that can be recognized by our image analysis model (Total Segmentor).
  • Figure 5: Report completion compared with GPT3. Inputs are displayed in black and the suggested completions are displayed in blue.
  • ...and 1 more figure