ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation

Siying Zhou, Yiquan Wu, Hui Chen, Xavier Hu, Kun Kuang, Adam Jatowt, Ming Hu, Chunyan Zheng, Fei Wu

TL;DR

The paper introduces ClaimGen-CN, a large-scale Chinese dataset for legal claim generation derived from 207,748 civil judgments, and defines two evaluation axes, factuality and clarity, for assessing generated claims. It formalizes the task as generating a set of claims $C$ from the facts $f$ of a case and grounds the dataset in real-world civil disputes with broad cause-of-action coverage. A zero-shot evaluation of six leading LLMs shows notable gaps between automatic metrics and human judgments, with model strengths varying across factuality, clarity, and legal grounding. The work highlights key limitations of current models and offers practical guidance for improving legal support for non-experts, including future work on long-chain reasoning and human-in-the-loop validation. Overall, ClaimGen-CN provides a resource for advancing accessible legal claim generation and for evaluating model performance in a domain with high societal impact.

Abstract

Legal claims refer to the plaintiff's demands in a case and are essential to guiding judicial reasoning and case resolution. While many works have focused on improving the efficiency of legal professionals, research on helping non-professionals (e.g., plaintiffs) remains unexplored. This paper explores the problem of generating legal claims from the facts of a given case. First, we construct ClaimGen-CN, the first dataset for the Chinese legal claim generation task, from various real-world legal disputes. Additionally, we design an evaluation metric tailored to assessing the generated claims, which covers two essential dimensions: factuality and clarity. Building on this, we conduct a comprehensive zero-shot evaluation of state-of-the-art general and legal-domain large language models. Our findings highlight the limitations of current models in factual precision and expressive clarity, pointing to the need for more targeted development in this domain. To encourage further exploration of this important task, we will make the dataset publicly available.

Paper Structure

This paper contains 34 sections, 3 equations, 15 figures, 10 tables.

Figures (15)

  • Figure 1: Conceptual overview of the differences between the pre-court scenario (left) and the in-court scenario (right). In the pre-court scenario, claims are generated and prepared; they are then addressed and resolved in the in-court scenario. Note that unreasonable claims may lead to a loss.
  • Figure 2: Claim generation for a given case. Purple parts indicate clarity mistakes in the prediction, while pink parts indicate factual mistakes. Green checkmarks indicate acceptable outputs, while red crosses mark outputs with factual or clarity errors.
  • Figure 3: Interface used for human annotation.
  • Figure 4: CaseId 1.
  • Figure 5: CaseId 2.
  • ...and 10 more figures