Follow-up Question Generation For Enhanced Patient-Provider Conversations
Joseph Gatto, Parker Seegmiller, Timothy Burdick, Inas S. Khayal, Sarah DeLozier, Sarah M. Preum
TL;DR
This work tackles the challenge of generating informative follow-up questions in asynchronous patient–provider conversations by modeling multiple clinical thought processes. It introduces FollowupQ, a multi-agent framework that combines EHR reasoning, differential diagnostics, and message clarifications to produce a pool of follow-up questions from a patient message and linked EHR data, followed by filtration to a usable set. A new dataset, FollowupBench, with real and synthetic asynchronous medical messages and 2,300 expert-authored questions, enables evaluation via the Requested Information Match (RIM) metric and an LLM-as-Judge for semantic question matching. Results show FollowupQ substantially improves coverage over baselines and reduces provider workload (e.g., ~34% fewer follow-up messages required) while providing interpretable agent-level insights, signaling practical value for telehealth workflows and NLP research on information-seeking in healthcare.
Abstract
Follow-up question generation is an essential feature of dialogue systems because it can reduce conversational ambiguity and improve the modeling of complex interactions. Conversational contexts often pose core NLP challenges such as (i) extracting relevant information buried in fragmented data sources, and (ii) modeling parallel thought processes. Both challenges occur frequently in medical dialogue, where a doctor asks questions based not only on patient utterances but also on the patient's prior EHR data and current diagnostic hypotheses. Asking medical questions in asynchronous conversations compounds these issues, since doctors can rely only on static EHR information to motivate follow-up questions. To address these challenges, we introduce FollowupQ, a novel framework for enhancing asynchronous medical conversation. FollowupQ is a multi-agent framework that processes patient messages and EHR data to generate personalized follow-up questions that clarify patient-reported medical conditions. FollowupQ reduces requisite provider follow-up communications by 34%. It also improves performance by 17% and 5% on real and synthetic data, respectively. We also release the first public dataset of asynchronous medical messages with linked EHR data, alongside 2,300 follow-up questions written by clinical experts, for the wider NLP research community.
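The generate-then-filter flow summarized in the TL;DR (several agents each propose questions from the patient message and EHR data, and the pooled questions are then reduced to a usable set) can be illustrated with a minimal Python sketch. Everything below is a hypothetical stand-in rather than the FollowupQ implementation: the `Case` structure, the three rule-based agent functions, and the filter (exact-match deduplication plus a length cap) are assumptions chosen only to make the control flow concrete. The framework described here relies on LLM-based agents, not keyword rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Case:
    """Hypothetical container for one patient message and its linked EHR data."""
    message: str
    ehr: Dict[str, List[str]]


# An agent maps a case to a list of candidate follow-up questions.
Agent = Callable[[Case], List[str]]


def ehr_agent(case: Case) -> List[str]:
    # Stand-in for EHR reasoning: ask about each listed medication.
    return [f"Are you still taking {med}?" for med in case.ehr.get("medications", [])]


def differential_agent(case: Case) -> List[str]:
    # Stand-in for differential diagnostics: toy keyword rule, not clinical logic.
    if "cough" in case.message.lower():
        return ["Do you have a fever?", "Are you short of breath?"]
    return []


def clarification_agent(case: Case) -> List[str]:
    # Stand-in for message clarification; deliberately overlaps with the agent above.
    if "cough" in case.message.lower():
        return ["How long have you had the cough?", "Do you have a fever?"]
    return []


def generate_pool(case: Case, agents: Dict[str, Agent]) -> List[Tuple[str, str]]:
    """Collect (agent_name, question) pairs from every agent."""
    return [(name, q) for name, agent in agents.items() for q in agent(case)]


def filter_pool(pool: List[Tuple[str, str]], max_questions: int) -> List[Tuple[str, str]]:
    """Assumed filter: drop case-insensitive exact duplicates, then cap the list length."""
    seen = set()
    kept = []
    for name, question in pool:
        key = " ".join(question.lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append((name, question))
    return kept[:max_questions]


AGENTS: Dict[str, Agent] = {
    "ehr": ehr_agent,
    "differential": differential_agent,
    "clarification": clarification_agent,
}

case = Case(
    message="I've had a cough for a while.",
    ehr={"medications": ["lisinopril"]},
)
pool = generate_pool(case, AGENTS)
questions = filter_pool(pool, max_questions=10)

for agent_name, question in questions:
    print(f"[{agent_name}] {question}")
```

Keeping the agent name attached to each question mirrors the agent-level interpretability mentioned in the TL;DR: one can see which line of reasoning produced each question that survives filtering.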
