Follow-up Question Generation For Enhanced Patient-Provider Conversations

Joseph Gatto, Parker Seegmiller, Timothy Burdick, Inas S. Khayal, Sarah DeLozier, Sarah M. Preum

TL;DR

This work tackles the challenge of generating informative follow-up questions in asynchronous patient–provider conversations by modeling multiple clinical thought processes. It introduces FollowupQ, a multi-agent framework that combines EHR reasoning, differential diagnostics, and message clarifications to produce a pool of follow-up questions from a patient message and linked EHR data, followed by filtration to a usable set. A new dataset, FollowupBench, with real and synthetic asynchronous medical messages and 2,300 expert-authored questions, enables evaluation via the Requested Information Match (RIM) metric and an LLM-as-Judge for semantic question matching. Results show FollowupQ substantially improves coverage over baselines and reduces provider workload (e.g., ~34% fewer follow-up messages required) while providing interpretable agent-level insights, signaling practical value for telehealth workflows and NLP research on information-seeking in healthcare.

Abstract

Follow-up question generation is an essential feature of dialogue systems, as it can reduce conversational ambiguity and improve the modeling of complex interactions. Conversational contexts often pose core NLP challenges such as (i) extracting relevant information buried in fragmented data sources, and (ii) modeling parallel thought processes. Both challenges occur frequently in medical dialogue, where a doctor asks questions based not only on patient utterances but also on the patient's prior EHR data and current diagnostic hypotheses. Asking medical questions in asynchronous conversations compounds these issues, as doctors can rely only on static EHR information to motivate follow-up questions. To address these challenges, we introduce FollowupQ, a novel framework for enhancing asynchronous medical conversation. FollowupQ is a multi-agent framework that processes patient messages and EHR data to generate personalized follow-up questions that clarify patient-reported medical conditions. FollowupQ reduces requisite provider follow-up communications by 34%. It also improves performance by 17% and 5% on real and synthetic data, respectively. We also release the first public dataset of asynchronous medical messages with linked EHR data, alongside 2,300 follow-up questions written by clinical experts, for the wider NLP research community.

Paper Structure

This paper contains 43 sections, 9 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: We demonstrate the core challenges of asynchronous follow-up question generation. Providers need to consider complex fragmented data sources to generate multiple follow-up questions reflecting parallel thought processes.
  • Figure 2: FollowupQ works by taking a patient message and a subset of their EHR and employing multiple LLM agents to explore diverse clinical thought processes, producing a pool of follow-up questions from different perspectives. If desired, FollowupQ can then filter the output to a controllable question set size.
  • Figure 3: Per-Agent Performance on FB-Real from FollowupQ (Llama3-8b). We find that most of the performance comes from agents trying to rule out the worst-case scenario for a patient.
  • Figure 4: Visualization of the data collection process for FollowupBench-Real.
  • Figure 5: Example sample used in grounded generation. The goal is to map all non-message features to a novel synthetic message based on randomly sampled features. We provide few-shot examples in-context to guide the generation.
  • ...and 2 more figures
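
The generate-then-filter flow described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the agent functions, their canned outputs, the `generate_pool` and `filter_pool` names, and the EHR dictionary layout are all hypothetical stand-ins, and the filter here simply truncates the pool, whereas the paper's actual filtration strategy is not specified in this section. In a real system each agent would presumably wrap an LLM call.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical agent signature: (patient message, EHR subset) -> candidate questions.
Agent = Callable[[str, Dict[str, List[str]]], List[str]]


def ehr_agent(message: str, ehr: Dict[str, List[str]]) -> List[str]:
    # Stand-in for an LLM agent that reasons over the EHR (assumed layout).
    return [f"Are you still taking {med}?" for med in ehr.get("medications", [])]


def clarification_agent(message: str, ehr: Dict[str, List[str]]) -> List[str]:
    # Stand-in for an agent that clarifies the patient's message.
    return ["When did your symptoms start?",
            "How severe is it on a scale of 1 to 10?"]


def differential_agent(message: str, ehr: Dict[str, List[str]]) -> List[str]:
    # Stand-in for an agent that probes diagnostic hypotheses.
    return ["Do you have any chest pain or shortness of breath?",
            "When did your symptoms start?"]


def generate_pool(message: str, ehr: Dict[str, List[str]],
                  agents: List[Agent]) -> List[str]:
    """Collect questions from every agent, dropping exact duplicates."""
    pool: List[str] = []
    seen = set()
    for agent in agents:
        for question in agent(message, ehr):
            key = question.strip().lower()
            if key not in seen:
                seen.add(key)
                pool.append(question)
    return pool


def filter_pool(pool: List[str], max_questions: Optional[int] = None) -> List[str]:
    """Placeholder filter: keep the first max_questions items (assumption)."""
    if max_questions is None:
        return list(pool)
    return pool[:max_questions]


message = "I've been feeling dizzy since yesterday."
ehr = {"medications": ["lisinopril"]}
agents = [ehr_agent, clarification_agent, differential_agent]

pool = generate_pool(message, ehr, agents)
shortlist = filter_pool(pool, max_questions=2)
```

Keeping each agent behind the same signature means the pool can be attributed back to individual agents, which is the kind of per-agent breakdown Figure 3 reports.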