Toward Human Centered Interactive Clinical Question Answering System
Dina Albassam
TL;DR
The paper addresses the challenge of extracting actionable information from unstructured clinical notes in electronic health records (EHRs). It presents an interactive clinical QA system that uses zero-shot prompting with OpenAI models to produce extractive, verbatim answer spans highlighted within the notes. Evaluation combines lexical and semantic metrics on the emrQA-msquad dataset with an AI-persona usability study, showing strong semantic alignment and a generally usable interface, with room to improve explanation clarity. The work contributes a physician-centered design, an end-to-end implementation, and an automated usability evaluation framework that connects QA advances to clinical workflow needs and adoption.
Abstract
Unstructured clinical notes contain essential patient information but are challenging for physicians to search and interpret efficiently. Although large language models (LLMs) have shown promise in question answering (QA), most existing systems lack transparency, usability, and alignment with clinical workflows. This work introduces an interactive QA system that enables physicians to query clinical notes via text or voice and receive extractive answers highlighted directly in the note for traceability. The system was built using OpenAI models with zero-shot prompting and evaluated across multiple metrics, including exact string match, word overlap, SentenceTransformer similarity, and BERTScore. Results show that while exact match scores ranged from 47 to 62 percent, semantic similarity scores exceeded 87 percent, indicating strong contextual alignment even when wording varied. To assess usability, the system was also evaluated using simulated clinical personas. Seven diverse physician and nurse personas interacted with the system across scenario-based tasks and provided structured feedback. The evaluations highlighted strengths in intuitive design and answer accessibility, alongside opportunities for enhancing explanation clarity.
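To make the lexical metrics and the highlighting step concrete, the following is a minimal sketch, not the system's actual implementation. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles), which the abstract does not specify, and it locates the answer span with a plain substring search. The function names `normalize`, `exact_match`, `token_f1`, and `find_span`, as well as the example note, are hypothetical and introduced only for illustration.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the paper may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    """True when the normalized strings are identical."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction: str, reference: str) -> float:
    """Word-overlap F1 between normalized prediction and reference tokens."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def find_span(note: str, answer: str):
    """Return (start, end) character offsets of a verbatim answer, or None."""
    start = note.find(answer)
    return None if start < 0 else (start, start + len(answer))


# Made-up example note, not taken from emrQA-msquad.
note = "Patient was started on metoprolol 25 mg twice daily for atrial fibrillation."
prediction = "metoprolol 25 mg twice daily"
reference = "metoprolol 25 mg"

is_exact = exact_match(prediction, reference)
f1 = token_f1(prediction, reference)
span = find_span(note, prediction)
```

In this made-up case the prediction is longer than the reference, so exact match fails while the word-overlap F1 is about 0.75. This is the kind of gap that can separate exact match scores from overlap and semantic similarity scores. The offsets returned by `find_span` are what a front end could use to highlight the answer inside the note.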
