VERDICT: Verifiable Evolving Reasoning with Directive-Informed Collegial Teams for Legal Judgment Prediction

Hui Liao, Chuan Qin, Yongwen Ren, Hao Li, Zhenya Huang, Yanyong Zhang, Chao Wang

Abstract

Legal Judgment Prediction (LJP) predicts applicable law articles, charges, and penalty terms from case facts. Beyond accuracy, LJP calls for intrinsically interpretable and legally grounded reasoning that can reconcile statutory rules with precedent-informed standards. However, existing methods often behave as static, one-shot predictors, providing limited procedural support for verifiable reasoning and little capability to adapt as jurisprudential practice evolves. We propose VERDICT, a self-refining collaborative multi-agent framework that simulates a virtual collegial panel. VERDICT assigns specialized agents to complementary roles (e.g., fact structuring, legal retrieval, opinion drafting, and supervisory verification) and coordinates them in a traceable draft-verify-revise workflow with explicit Pass/Reject feedback, producing verifiable reasoning traces and revision rationales. To capture evolving case experience, we further introduce a Hybrid Jurisprudential Memory (HJM) grounded in the Micro-Directive Paradigm, which stores precedent standards and continually distills validated multi-agent verification trajectories into updated Micro-Directives for continual learning across cases. We evaluate VERDICT on CAIL2018 and a newly constructed CJO2025 dataset with a strict future time-split for temporal generalization. VERDICT achieves state-of-the-art performance on CAIL2018 and demonstrates strong generalization on CJO2025. To facilitate reproducibility and further research, we release our code and the dataset at https://anonymous.4open.science/r/ARR-4437.

Paper Structure (38 sections, 11 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: The overall inference framework of VERDICT. It illustrates the interaction between the Traceable Multi-Agent Workflow and the Hybrid Jurisprudential Memory (HJM). The process evolves through Preparation, Drafting & Review, and Final Adjudication phases. Note that the Case Judge agent is instantiated using our domain-specific aligned expert model.
  • Figure 2: The evolutionary mechanism of the Hybrid Jurisprudential Memory. The lifecycle strategy $\Phi_{trans}$ transforms archived empirical Standards into precise and compact Micro-Directives through three phases.
  • Figure 3: Case study comparison. The red highlights indicate misleading surface features (resulting in Intentional Injury), while the green highlights denote contextual evidence supporting Picking Quarrels. Vanilla falls into the keyword trap; PLJP fails to resolve the statutory conflict; VERDICT correctly identifies the crime's nature via the Supervisor's logical rectification.
  • Figure 4: System prompt for the Court Clerk Agent, responsible for distilling objective event points from raw facts.
  • Figure 5: System prompt for the Judicial Assistant Agent, performing semantic re-ranking of precedents and statutes.
  • ...and 3 more figures