DDO: Dual-Decision Optimization for LLM-Based Medical Consultation via Multi-Agent Collaboration

Zhihao Jia, Mingyi Jia, Junwen Duan, Jianxin Wang

TL;DR

DDO addresses the dual nature of medical consultation (MC) by decoupling symptom inquiry (a sequential decision problem over a high-dimensional action space) from disease diagnosis (a classification problem) and handling them in a four-agent, memory-guided workflow. A Diagnosis Agent estimates disease confidence via Binary Token Probability, calibrated with an adapter trained by in-batch contrastive learning. A Policy Agent, trained with PPO-based RL, uses masked sampling to generate informative candidate actions; an Inquiry Agent chooses among them; and a Patient Agent simulates the patient's responses. Across three real-world MC datasets, DDO consistently outperforms other LLM-based methods and is competitive with state-of-the-art generation-based approaches, while requiring far less training overhead. The modular, transparent design improves diagnostic accuracy and offers a practical path toward scalable MC under a limited number of interaction turns.
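The two mechanisms named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that Binary Token Probability means renormalizing the LLM's logits over a "yes"/"no" answer pair, and that masked sampling means drawing candidate actions from a softmax while excluding symptoms that were already asked. The function names, the `asked` set, and the candidate count `k` are hypothetical; the calibration adapter and PPO training are omitted.

```python
import math
import random


def binary_token_confidence(logit_yes: float, logit_no: float) -> float:
    """Assumed form of Binary Token Probability: p(yes) / (p(yes) + p(no)).

    Only the two answer-token logits are used; the rest of the vocabulary
    is ignored. Subtracting the max keeps exp() numerically stable.
    """
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


def masked_sample(logits, asked, k, rng):
    """Sample up to k distinct action indices, never one listed in `asked`.

    Masking is done by removing already-asked indices from the pool, which
    is equivalent to setting their logits to -inf before the softmax.
    """
    available = [i for i in range(len(logits)) if i not in asked]
    chosen = []
    while available and len(chosen) < k:
        m = max(logits[i] for i in available)
        weights = [math.exp(logits[i] - m) for i in available]
        pick = rng.choices(available, weights=weights, k=1)[0]
        chosen.append(pick)
        available.remove(pick)  # sample without replacement
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    print(binary_token_confidence(2.0, 0.0))
    print(masked_sample([0.1, 1.5, -0.3, 0.8], asked={1}, k=2, rng=rng))
```

Under these assumptions, equal logits give a confidence of 0.5, and a symptom index in `asked` can never reappear among the candidates.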

Abstract

Large Language Models (LLMs) demonstrate strong generalization and reasoning abilities, making them well-suited for complex decision-making tasks such as medical consultation (MC). However, existing LLM-based methods often fail to capture the dual nature of MC, which entails two distinct sub-tasks: symptom inquiry, a sequential decision-making process, and disease diagnosis, a classification problem. This mismatch often results in ineffective symptom inquiry and unreliable disease diagnosis. To address this, we propose DDO, a novel LLM-based framework that performs Dual-Decision Optimization by decoupling the two sub-tasks and optimizing them with distinct objectives through a collaborative multi-agent workflow. Experiments on three real-world MC datasets show that DDO consistently outperforms existing LLM-based approaches and achieves competitive performance with state-of-the-art generation-based methods, demonstrating its effectiveness in the MC task. The code is available at https://github.com/zh-jia/DDO.

Paper Structure

This paper contains 38 sections, 6 equations, 4 figures, and 17 tables.

Figures (4)

  • Figure 1: An example of a Medical Consultation (MC) task, where an AI doctor iteratively inquires about additional symptoms based on the patient's initial self-reported symptoms and ultimately provides a diagnosis.
  • Figure 2: Overview of the proposed DDO framework, comprising four collaborative agents operating over a shared memory to execute the consultation workflow: the Diagnosis Agent estimates disease confidences from LLM logits; the Policy Agent generates candidate actions via masked sampling; the Inquiry Agent selects the optimal symptom to query or terminates the consultation; and the Patient Agent responds based on the patient profile.
  • Figure 3: Effect of max turns $L$.
  • Figure 4: Diagnosis performance at the disease level on the GMD dataset.
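The workflow summarized in the Figure 2 caption, bounded by the max-turns parameter $L$ of Figure 3, can be sketched as a simple loop. This is an illustrative reading of the captions only, not the released code: the four agents are passed in as plain callables, the shared memory is a dictionary from symptom name to presence, and all names (`run_consultation`, `diagnose`, `propose`, `select`, `answer`, `max_turns`) are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Memory = Dict[str, bool]          # symptom name -> present or absent
Confidences = Dict[str, float]    # disease name -> confidence


def run_consultation(
    initial_symptoms: Memory,
    diagnose: Callable[[Memory], Confidences],                          # Diagnosis Agent
    propose: Callable[[Memory, Confidences], List[str]],                # Policy Agent
    select: Callable[[Memory, Confidences, List[str]], Optional[str]],  # Inquiry Agent
    answer: Callable[[str], bool],                                      # Patient Agent
    max_turns: int,
) -> str:
    """Run at most `max_turns` inquiry turns, then return the top disease."""
    memory = dict(initial_symptoms)  # shared memory, seeded with the self-report
    for _ in range(max_turns):
        confidences = diagnose(memory)
        # Drop candidates that are already recorded in memory.
        candidates = [s for s in propose(memory, confidences) if s not in memory]
        symptom = select(memory, confidences, candidates) if candidates else None
        if symptom is None:  # the Inquiry Agent chose to terminate
            break
        memory[symptom] = answer(symptom)
    final = diagnose(memory)
    return max(final, key=final.get)
```

Passing the agents as callables keeps the loop independent of any particular LLM backend, so stub functions are enough to exercise the control flow.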