Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction

Shengbin Yue; Ting Huang; Zheng Jia; Siyuan Wang; Shujun Liu; Yun Song; Xuanjing Huang; Zhongyu Wei

TL;DR

This work tackles the scarcity of scalable, interactive legal scenario data by introducing MASER, a Multi-agent Legal Simulation Driver that coordinates role-preserving data generation across Client, Lawyer, and Supervisor agents. It combines real-world legal sources with Big-5 personality-based role presets and a sentence-level supervision mechanism to produce authentic, distractor-aware social simulations, culminating in SynthLaw, a large synthetic dataset for fine-tuning LLMs. The accompanying MILE benchmark evaluates LLM-driven lawyers in dynamic, goal-oriented tasks, using a two-phase assessment of interaction quality and final complaint quality, derived from real judgments. Experimental results show SynthLaw markedly improves interactive and goal-oriented performance over baselines, bridging the gap between intensive interaction and legal task achievement, with strong robustness across different client profiles and base models. The framework promises scalable, domain-specific data generation for advanced legal AI systems and can be extended to more complex proceedings and consultative contexts.

Abstract

Large Language Models (LLMs) have significantly advanced legal intelligence, but the scarcity of scenario data impedes progress toward interactive legal scenarios. This paper introduces a Multi-agent Legal Simulation Driver (MASER) to scalably generate synthetic data by simulating interactive legal scenarios. Leveraging real legal case sources, MASER ensures the consistency of legal attributes between participants and introduces a supervisory mechanism that aligns participants' characters and behaviors and addresses distractions. A Multi-stage Interactive Legal Evaluation (MILE) benchmark is further constructed to evaluate LLMs' performance in dynamic legal scenarios. Extensive experiments confirm the effectiveness of our framework.
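To make the Client, Lawyer, and Supervisor roles concrete, the following is a minimal sketch of a supervised dialogue loop. It is not the authors' implementation: the agents are plain callables rather than LLMs, the supervisor is reduced to a hypothetical forbidden-term check applied sentence by sentence, and every name (`simulate`, `supervise`, `make_stub_client`, `forbidden`, `max_retries`) is an assumption introduced for illustration.

```python
import re
from typing import Callable, List, Tuple

Dialogue = List[Tuple[str, str]]  # (speaker, utterance) pairs


def split_sentences(text: str) -> List[str]:
    # Naive splitter on ., ! and ? followed by whitespace (illustrative only).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def supervise(utterance: str, forbidden: List[str]) -> List[str]:
    """Hypothetical sentence-level check: return each sentence that
    contains a term the role should not use (e.g. a lay client using jargon)."""
    return [
        s for s in split_sentences(utterance)
        if any(term.lower() in s.lower() for term in forbidden)
    ]


def simulate(
    client: Callable[[Dialogue], str],
    lawyer: Callable[[Dialogue], str],
    forbidden: List[str],
    max_turns: int = 3,
    max_retries: int = 2,
) -> Dialogue:
    """Alternate lawyer and client turns; the supervisor screens each
    client reply and requests a regeneration when a sentence is flagged."""
    history: Dialogue = []
    for _ in range(max_turns):
        history.append(("lawyer", lawyer(history)))
        reply = client(history)
        for _ in range(max_retries):
            if not supervise(reply, forbidden):
                break
            reply = client(history)  # assumed behavior: ask the client to retry
        history.append(("client", reply))
    return history


def make_stub_client(drafts: List[str]) -> Callable[[Dialogue], str]:
    # Stand-in for an LLM client agent: replays canned drafts in order.
    drafts_iter = iter(drafts)
    return lambda history: next(drafts_iter)
```

In a real system the stub callables would be replaced by model calls conditioned on a role preset, and the flagged sentences would be fed back to the agent as correction hints rather than triggering a blind retry.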
