BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation
Yuhong He, Yongqi Zhang, Shizhu He, Jun Wan
TL;DR
BP4ER tackles the interpretability gap in medical dialogue generation (MDG) by recasting MDG as a multi-step reasoning task and generating explicit intermediate reasoning chains. It combines least-to-most prompting with two bootstrapping strategies (AP-Bootstrap and PR-Bootstrap) to autonomously correct reasoning errors and iteratively improve the model without heavy entity annotation. The approach decomposes response generation into three steps (patient state tracking, next diagnosis decision, and physician response), each guided by demonstration prompts, and refines the reasoning through filtered, bootstrapped data. Experimental results on MedDG and KaMed show consistent improvements over strong baselines in both automatic metrics and human judgments, supporting the value of explicit reasoning for MDG and its potential to enhance transparency and reliability in clinical dialogue systems.
Abstract
Medical dialogue generation (MDG) has gained increasing attention due to its substantial practical value. Previous works typically employ a sequence-to-sequence framework to generate medical responses by modeling dialogue context as sequential text with annotated medical entities. While these methods generate fluent responses, they do not explain the reasoning process behind them and they require extensive entity annotation. To address these limitations, we propose Bootstrap Prompting for Explicit Reasoning in MDG (BP4ER), which explicitly models MDG's multi-step reasoning process and iteratively enhances it. We employ a least-to-most prompting strategy to guide a large language model (LLM) in explicit reasoning, breaking MDG down into simpler sub-questions, each of which builds on the answers to the previous ones. We also introduce two distinct bootstrapping techniques for prompting, which autonomously correct errors and facilitate the LLM's explicit reasoning. This approach eliminates the need for entity annotation and increases the transparency of the MDG process by explicitly generating the intermediate reasoning chain. Experimental results on two public datasets indicate that BP4ER outperforms state-of-the-art methods on both objective and subjective evaluation metrics.
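As a rough illustration of the least-to-most decomposition described above, the sketch below chains three sub-questions so that each prompt contains the answers to the earlier ones. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `least_to_most`, the sub-question wording, and the `llm` callable (any function mapping a prompt string to a completion string) are all hypothetical, and the bootstrapping stage is not covered.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sub-questions for the three steps named in the TL;DR.
# The wording is an assumption; the paper's actual prompts are not shown here.
SUB_QUESTIONS: List[Tuple[str, str]] = [
    ("patient_state", "What symptoms and conditions has the patient reported so far?"),
    ("next_decision", "Given the patient state, what should the physician ask about or diagnose next?"),
    ("response", "Given that decision, what should the physician say to the patient?"),
]


def least_to_most(dialogue: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Answer the sub-questions in order, feeding earlier answers into later prompts."""
    answers: Dict[str, str] = {}
    context = f"Dialogue:\n{dialogue}\n"
    for name, question in SUB_QUESTIONS:
        prompt = f"{context}\nQ: {question}\nA:"
        answer = llm(prompt).strip()
        answers[name] = answer
        # Append the solved sub-question so the next step can build on it.
        context += f"\nQ: {question}\nA: {answer}\n"
    return answers
```

The returned dictionary holds the intermediate reasoning chain (patient state and next decision) alongside the final response, which is what makes the generation process inspectable in this sketch.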
