ReXplain: Translating Radiology into Patient-Friendly Video Reports

Luyang Luo; Jenanan Vairavamurthy; Xiaoman Zhang; Abhinav Kumar; Ramon R. Ter-Oganesyan; Stuart T. Schroff; Dan Shilo; Rydhwana Hossain; Mike Moritz; Pranav Rajpurkar

TL;DR

ReXplain proposes an end-to-end multimodal pipeline that translates radiology findings into patient-friendly video reports by integrating an LLM for lay-language translation and text-anatomy association, a segmentation model for highlighting anatomy, and an avatar-based visualization. By grounding textual explanations to specific image regions, providing normal-reference comparisons, and delivering narrated explanations via a virtual radiologist, it aims to enhance patient understanding while reducing clinician workload. In a proof-of-concept with six radiologists, the system showed promising accuracy and practical potential for pre-consultation education and improved patient engagement. The work lays groundwork for AI-assisted, patient-centered radiology reporting and identifies clear directions for precision lesion-grounding and safety considerations in future clinical deployment.

Abstract

Radiology reports, designed for efficient communication between medical experts, often remain incomprehensible to patients. This inaccessibility could lead to anxiety, decreased engagement in treatment decisions, and poorer health outcomes, undermining patient-centered care. We present ReXplain (Radiology eXplanation), an innovative AI-driven system that translates radiology findings into patient-friendly video reports. ReXplain uniquely integrates a large language model for medical text simplification and text-anatomy association, an image segmentation model for anatomical region identification, and an avatar generation tool for engaging interface visualization. ReXplain produces comprehensive explanations with plain language, highlighted imagery, and 3D organ renderings in the form of video reports. To evaluate the utility of ReXplain-generated explanations, we conducted two rounds of user feedback collection from six board-certified radiologists. The results of this proof-of-concept study indicate that ReXplain could accurately deliver radiological information and effectively simulate one-on-one consultation, shedding light on enhancing patient-centered radiology with potential clinical usage. This work demonstrates a new paradigm in AI-assisted medical communication, potentially improving patient engagement and satisfaction in radiology care, and opens new avenues for research in multimodal medical communication.

Paper Structure (29 sections, 4 figures, 1 table)

This paper contains 29 sections, 4 figures, 1 table.

Introduction
Related Works
Designing ReXplain
Sketching Video Reports Concepts.
Interpreting Radiology Reports with Lay-language.
Connecting Image Regions with Reports.
Comparing with a Healthy Individual.
Generating Avatar Explainer.
Generating Patient-friendly Video Reports
Integrating Key Elements into Video Reports.
Exemplifying a Typical Video Report.
Technical Performance
Eliciting User Feedback
Materials.
Ethics Statement
...and 14 more sections

Figures (4)

Figure 1: Key elements of ReXplain-generated video reports. a. The image of the patient with the key finding, calcification in the aorta, highlighted in a bounding box; b. A normal reference image registered with the query image, with the aorta also highlighted; c. The reconstruction rendering of the aorta; d. A talking avatar explaining the text report in lay language.
Figure 2: Illustration of the ReXplain pipeline. a. GPT-4o is first used to generate a lay-language explanation from the text report. It is also used to map the findings to the corresponding organs; b. Avatar generation based on text-to-speech generation and Gaussian Splatting, which simulates a one-on-one communication interface; c. A universal CT organ segmentation model is used to highlight the anatomy of interest according to the organ extracted in the previous steps; d. The final video report combining the key elements generated in the previous steps.
Figure 3: Comparison between a conventional written report and an illustration of a video report generated by ReXplain. a. The original report describes both negative and positive findings. b. The video report highlights positive findings with grounded image regions, emphasizes comparison with normal images, renders the holistic organ structure, and explains the findings in lay language with an avatar explainer. A sample video report can be found in the supplementary.
Figure 4: Round 1 user study results. Questions are sorted in descending order of the combined percentage of "strongly agree" and "agree". Neutral = "neither agree or disagree".
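
The pipeline in Figure 2 can be pictured as a small data flow: each finding carries a lay-language explanation and a mapped organ, the organ's segmentation mask yields a region to highlight, and the results are collected into scenes for the video. The following is a minimal sketch of that flow only, not the authors' implementation. The names `Finding`, `bounding_box`, and `plan_scenes` are hypothetical, the LLM and segmentation outputs are assumed to be supplied by earlier steps, and the mask is a toy 2D slice rather than a CT volume.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Toy 2D binary mask for one image slice: 1 marks a pixel inside the organ.
Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max)


@dataclass
class Finding:
    """Hypothetical record for one report finding."""
    text: str      # sentence from the original report
    lay_text: str  # lay-language explanation (assumed to come from the LLM step)
    organ: str     # organ the finding maps to (assumed to come from the LLM step)


def bounding_box(mask: Mask) -> Optional[Box]:
    """Return the tight box around all nonzero pixels, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return (min(rows), min(cols), max(rows), max(cols))


def plan_scenes(findings: List[Finding], organ_masks: Dict[str, Mask]) -> List[dict]:
    """Pair each finding's narration with the region to highlight, if a mask exists."""
    scenes = []
    for finding in findings:
        mask = organ_masks.get(finding.organ)
        box = bounding_box(mask) if mask is not None else None
        scenes.append(
            {"narration": finding.lay_text, "organ": finding.organ, "highlight": box}
        )
    return scenes


# Illustrative inputs only; the lay text below is made up for this example.
aorta_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
findings = [
    Finding(
        text="Calcification in the aorta.",
        lay_text="There are calcium deposits in the wall of your main artery.",
        organ="aorta",
    )
]
scenes = plan_scenes(findings, {"aorta": aorta_mask})
print(scenes[0]["highlight"])
```

A finding whose organ has no mask still produces a scene, with `highlight` set to `None`, so the narration is kept even when nothing can be grounded in the image.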
