BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation

Yuhong He, Yongqi Zhang, Shizhu He, Jun Wan

TL;DR

BP4ER tackles the interpretability gap in medical dialogue generation by recasting MDG as a multi-step reasoning task and generating explicit intermediate reasoning chains. It combines least-to-most prompting with two bootstrapping strategies (AP-Bootstrap and PR-Bootstrap) to autonomously correct reasoning and iteratively improve the model without heavy entity annotation. The approach decomposes dialogue into three steps—patient state tracking, next diagnosis decision, and physician response—guided by demonstration prompts, and refines reasoning through filtered, bootstrapped data. Experimental results on MedDG and KaMed show consistent improvements over strong baselines in both automatic metrics and human judgments, supporting the value of explicit reasoning for MDG and its potential to enhance transparency and reliability in clinical dialogue systems.

Abstract

Medical dialogue generation (MDG) has gained increasing attention due to its substantial practical value. Previous works typically employ a sequence-to-sequence framework to generate medical responses by modeling dialogue context as sequential text with annotated medical entities. While these methods generate fluent responses, they do not explain their reasoning process and they require extensive entity annotation. To address these limitations, we propose Bootstrap Prompting for Explicit Reasoning in MDG (BP4ER), which explicitly models MDG's multi-step reasoning process and iteratively enhances it. We employ a least-to-most prompting strategy to guide a large language model (LLM) in explicit reasoning, breaking MDG down into simpler sub-questions, each of which builds on the answers to previous ones. We also introduce two distinct bootstrapping techniques for prompting, which autonomously correct errors and facilitate the LLM's explicit reasoning. This approach eliminates the need for entity annotation and increases the transparency of the MDG process by explicitly generating the intermediate reasoning chain. Experimental results on two public datasets indicate that BP4ER outperforms state-of-the-art methods on both objective and subjective evaluation metrics.
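
The least-to-most chain described above can be illustrated with a minimal sketch. It is not the authors' implementation. The three sub-question names follow the steps listed in the TL;DR (patient state tracking, next diagnosis decision, physician response), but the question wording, the prompt layout, the function `least_to_most`, and the stand-in model `echo_llm` are all assumptions made for illustration. Demonstration prompts and both bootstrapping strategies are omitted.

```python
from typing import Callable, Dict, List, Tuple

# Sub-questions ordered from simplest to hardest. The step names follow the
# summary; the question wording is hypothetical.
SUB_QUESTIONS: List[Tuple[str, str]] = [
    ("patient_state", "What symptoms and conditions has the patient reported so far?"),
    ("next_decision", "Given the patient state, what should the physician address next?"),
    ("response", "Given that decision, what should the physician say to the patient?"),
]


def least_to_most(dialogue: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Query `llm` once per sub-question, carrying earlier answers forward."""
    prompt = f"Dialogue:\n{dialogue}\n"
    answers: Dict[str, str] = {}
    for name, question in SUB_QUESTIONS:
        # Append the next sub-question to everything asked and answered so far.
        prompt += f"Q: {question}\nA:"
        answer = llm(prompt).strip()
        answers[name] = answer
        # Keep the answer in the prompt so later sub-questions can build on it.
        prompt += f" {answer}\n"
    return answers


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model: reports how many sub-questions it has seen."""
    return f"answer {prompt.count('Q:')}"


if __name__ == "__main__":
    chain = least_to_most("P: I have had a stomach ache since yesterday.", echo_llm)
    for step, text in chain.items():
        print(f"{step}: {text}")
```

Because every intermediate answer is returned as well as fed forward, the whole reasoning chain can be inspected alongside the final response, which is the transparency benefit the abstract refers to. Any callable that maps a prompt string to a completion string can replace `echo_llm`.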

Paper Structure (24 sections, 2 equations, 3 figures, 3 tables)

This paper contains 24 sections, 2 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Paradigm comparison in MDG: prior works adopt a Seq2Seq framework (a); our model (b) explicitly incorporates a multi-step reasoning process and reduces entity annotation.
  • Figure 2: Overview of BP4ER. Medical dialogue is decomposed into a reasoning chain of sub-questions. Demonstration prompts guide intermediate reasoning, sequentially querying the LLM. Two bootstrapping techniques for prompting, AP-Bootstrap and PR-Bootstrap, are introduced to enhance explicit reasoning.
  • Figure 3: A case study on comparative responses generated from various models, where "P" represents patient descriptions and "R" represents system responses.