BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT

Yirong Chen; Zhenyu Wang; Xiaofen Xing; Huimin Zheng; Zhipei Xu; Kai Fang; Junhong Wang; Sihang Li; Jieling Wu; Qi Liu; Xiangmin Xu

TL;DR

Health LLMs typically excel at single-turn suggestions but struggle with the proactive, multi-turn questioning, termed chain of questioning (CoQ), that personalized advice requires. The paper introduces BianQue, a health LLM that balances questioning and suggestion, finetuned on the large-scale BianQueCorpus, which was built by cleaning real-world conversations and polishing doctor responses with ChatGPT. The model is built on ChatGLM-6B and uses a structured, multi-turn input format to generate alternating questions and health suggestions. Across four multi-turn health-dialogue benchmarks, BianQue outperforms baselines in both questioning ability and suggestion quality, as measured by BLEU/ROUGE and a Proactive Questioning Ability (PQA) metric. The results highlight its potential for proactive health conversations, while safety and privacy considerations are left for future work.
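The structured, multi-turn input format mentioned above can be pictured as a flattened dialogue history that ends with a cue for the next doctor turn. The following is a minimal sketch under stated assumptions: the role labels `Patient` and `Doctor`, the function name `build_input`, and the newline-separated layout are all hypothetical and are not taken from the paper, whose actual template may differ.

```python
from typing import List, Tuple

# Hypothetical role labels; the paper's actual template may differ.
PATIENT = "Patient"
DOCTOR = "Doctor"


def build_input(history: List[Tuple[str, str]]) -> str:
    """Flatten (role, utterance) pairs into one prompt ending with a doctor cue."""
    if not history or history[-1][0] != PATIENT:
        # The model is asked to produce the doctor's next turn,
        # so the last recorded turn must come from the patient.
        raise ValueError("history must end with a patient turn")
    lines = [f"{role}: {text.strip()}" for role, text in history]
    lines.append(f"{DOCTOR}:")  # open slot for the next question or suggestion
    return "\n".join(lines)


history = [
    (PATIENT, "I have been coughing for a week."),
    (DOCTOR, "Is the cough dry, or do you bring up sputum?"),
    (PATIENT, "There is some yellow sputum."),
]
prompt = build_input(history)
print(prompt)
```

Keeping the full history in the input is what would let a model decide, turn by turn, whether to ask another question or to give a suggestion.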

Abstract

Large language models (LLMs) have performed well in providing general and extensive health suggestions in single-turn conversations, as exemplified by systems such as ChatGPT, ChatGLM, ChatDoctor, and DoctorGLM. However, the limited information users provide in a single turn results in suggestions that are insufficiently personalized and targeted, leaving users to pick out the useful parts themselves. This is mainly caused by the lack of an ability to engage in multi-turn questioning. In real-world medical consultations, doctors usually employ a series of iterative inquiries to understand the patient's condition thoroughly, which enables them to provide effective and personalized suggestions afterwards; for LLMs, this can be defined as a chain of questioning (CoQ). To improve the CoQ of LLMs, we propose BianQue, a ChatGLM-based LLM finetuned on the self-constructed health conversation dataset BianQueCorpus, which consists of multi-turn questioning and health suggestions polished by ChatGPT. Experimental results demonstrate that BianQue can balance the capabilities of questioning and health suggestion, which will help promote the research and application of LLMs in the field of proactive health.

Paper Structure (12 sections, 2 equations, 8 figures, 1 table)

Introduction
Methodology
BianQueCorpus: Balancing Questioning and Suggestion
BianQue Model
Experiments
Baselines and Benchmarks
Implementation details
Results and Analysis
Conclusion and Future Work
Reproducibility Checklist
Sample Conversations of LLMs
Sample Conversations of BianQue

Figures (8)

Figure 1: Example of chain of questioning (CoQ). The sentences in red show the doctor's CoQ: a series of questions about cough duration, cough sound, sputum status, fever status, examination, medication and treatment.
Figure 2: Proportion of questions and suggestions in answers of BianQueCorpus.
Figure 3: Construction process of BianQueCorpus dataset and BianQue model.
Figure 4: The prompt used for polishing the suggestions of doctors based on real-world multi-turn conversation context.
Figure 5: A case of a user confiding to ChatGPT.
...and 3 more figures
