Effectiveness of ChatGPT in explaining complex medical reports to patients
Mengxuan Sun, Ehud Reiter, Anne E Kiltie, George Ramsay, Lisa Duncan, Peter Murchie, Rosalind Adam
TL;DR
This study evaluates whether ChatGPT (GPT-4) can explain dense cancer multidisciplinary team (MDT) reports to patients, using six fictitious MDT reports (colorectal and prostate) and four prompt scenarios. It employs a mixed-methods design with a pilot review, lay and clinician annotations, and focus-group workshops to examine accuracy, language, content, trust, and workflow integration. The findings reveal pervasive issues in accuracy, language appropriateness, and clinical relevance, and the authors caution against clinical deployment without governance, personalization, and clinician validation. The work highlights barriers to adoption and outlines concrete directions for improving AI-assisted explanations in oncology through prompt design, safety checks, and alignment with healthcare workflows.
Abstract
Electronic health records contain detailed information about patients' medical conditions, but the records are difficult for patients to understand even when they have access to them. We explore whether ChatGPT (GPT-4) can help explain multidisciplinary team (MDT) reports to colorectal and prostate cancer patients. These reports are written in dense medical language and assume clinical knowledge, so they are a good test of ChatGPT's ability to explain complex medical reports to patients. We asked clinicians and lay people (not patients) to review ChatGPT's explanations and responses. We also ran three focus groups (including cancer patients, caregivers, computer scientists, and clinicians) to discuss ChatGPT's output. Our studies highlighted issues with inaccurate information, inappropriate language, limited personalization, AI distrust, and challenges in integrating large language models (LLMs) into clinical workflows. These issues will need to be resolved before LLMs can be used to explain complex personal medical information to patients.
