From Medical Records to Diagnostic Dialogues: A Clinical-Grounded Approach and Dataset for Psychiatric Comorbidity

Tianxi Wan, Jiaming Luo, Siyuan Chen, Kunyao Lan, Jianhua Chen, Haiyang Geng, Mengyue Wu

TL;DR

This work tackles the challenge of diagnosing psychiatric comorbidity by building a large-scale, clinically grounded dialogue dataset. It introduces a two-stage pipeline: first, PsyCoProfile converts social-media self-reports into 502 structured EMRs aligned with SCID-5-RV; second, PsyCoTalk uses a three-agent framework with Hierarchical Diagnostic State Machines and a Diagnostic Context Tree to generate 3,000 multi-turn dialogues, validated by psychiatrists. The dataset exhibits high structural fidelity and realism, enabling evaluation of multi-disorder screening in a single conversational pass and achieving improved diagnostic accuracy over baseline zero-shot systems. The resource advances research in psychiatric comorbidity by enabling scalable training and rigorous evaluation of inference and reasoning in DSM-aligned dialogues, with potential impact on clinical decision-support and resource-constrained settings.

Abstract

Psychiatric comorbidity is clinically significant yet challenging due to the complexity of multiple co-occurring disorders. To address this, we develop a novel approach integrating synthetic patient electronic medical record (EMR) construction and multi-agent diagnostic dialogue generation. We create 502 synthetic EMRs for common comorbid conditions using a pipeline that ensures clinical relevance and diversity. Our multi-agent framework transfers the clinical interview protocol into a hierarchical state machine and context tree, supporting over 130 diagnostic states while maintaining clinical standards. Through this rigorous process, we construct PsyCoTalk, the first large-scale dialogue dataset supporting comorbidity, containing 3,000 multi-turn diagnostic dialogues validated by psychiatrists. This dataset enhances diagnostic accuracy and treatment planning, offering a valuable resource for psychiatric comorbidity research. Compared to real-world clinical transcripts, PsyCoTalk exhibits high structural and linguistic fidelity in terms of dialogue length, token distribution, and diagnostic reasoning strategies. Licensed psychiatrists confirm the realism and diagnostic validity of the dialogues. This dataset enables the development and evaluation of models capable of multi-disorder psychiatric screening in a single conversational pass.

Paper Structure

This paper contains 22 sections, 2 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: Framework overview: 1) EMR: construct patient profiles with 6 comorbidity types in electronic medical record structure by extracting social media posts of users self-reporting multiple disorders; 2) Dialogue: build a multi-agent framework with hierarchical state machines, based on SCID-5-RV, a standardized and semi-structured interview guide for assessing major disorders, to construct a comorbidity-focused diagnostic dialogue dataset, PsyCoTalk.
  • Figure 2: Overview of the EMR generation pipeline. Starting from social media posts, different modules of the EMR are generated using distinct methods, with the overall process proceeding in a top-down manner. The LLM used for EMR generation is GPT-4o-mini.
  • Figure 3: Comparisons between synthetic EMRs and real-world data.
  • Figure 4: Multi-agent Diagnosis Framework. LLM-based doctor and patient agents interact under a tool agent that manages dialogue flow and diagnostic state transitions. The tool agent combines LLM and rule-based logic to generate patient experiences and regulate the Dynamic Diagnostic Tree.
  • Figure 5: MDD sub-state machine. Solid nodes denote topic-specific questions. Colored arrows represent binary responses (present vs. absent), i.e., whether the patient exhibits the symptom. Groups A00 and A01 are activated when $\geq 5$ "present" responses are observed; otherwise, they follow the alternative transition.
  • ...and 9 more figures
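
The threshold rule described in the Figure 5 caption can be illustrated with a small state machine. This is a minimal sketch, not the authors' implementation: the class name `SubStateMachine`, the question labels, and the transition names `ask_next`, `activate_groups`, and `alternative` are all hypothetical. The only detail taken from the caption is the rule that at least 5 "present" responses trigger the activation branch (groups A00 and A01), while fewer lead to the alternative transition.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Threshold from the Figure 5 caption: >= 5 "present" responses.
PRESENT_THRESHOLD = 5


@dataclass
class SubStateMachine:
    """Toy sub-state machine (hypothetical): collects binary symptom
    responses, then chooses a transition based on the present count."""

    questions: List[str]
    responses: Dict[str, bool] = field(default_factory=dict)

    def record(self, question: str, present: bool) -> None:
        """Store a binary response (present vs. absent) for one question."""
        if question not in self.questions:
            raise KeyError(f"unknown question: {question}")
        self.responses[question] = present

    def present_count(self) -> int:
        """Number of questions answered 'present' so far."""
        return sum(self.responses.values())

    def next_transition(self) -> str:
        """Return a hypothetical transition label for the current state."""
        if len(self.responses) < len(self.questions):
            return "ask_next"  # still questions left to ask
        if self.present_count() >= PRESENT_THRESHOLD:
            return "activate_groups"  # stands in for activating A00 and A01
        return "alternative"


# Usage with placeholder question labels (assumed, not from the paper).
machine = SubStateMachine(questions=[f"q{i}" for i in range(9)])
for i in range(9):
    machine.record(f"q{i}", present=(i < 5))
print(machine.present_count(), machine.next_transition())
```

Keeping the threshold as a named constant makes the branching rule easy to swap per disorder, which is one plausible way a hierarchical design could reuse the same logic across sub-state machines.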