SEMDR: A Semantic-Aware Dual Encoder Model for Legal Judgment Prediction with Legal Clue Tracing

Pengjie Liu; Wang Zhang; Yulong Ding; Xuefeng Zhang; Shuang-Hua Yang

TL;DR

The paper tackles Legal Judgment Prediction (LJP), focusing on distinguishing between confusable criminal charges. It introduces SEMDR, a semantic-aware dual encoder with a three-level legal clue tracing mechanism (Lexicon-Tracing, Sentence Representation Learning via contrastive training with dropout, and Multi-Fact Reasoning over a case-enhancement graph) that enables fine-grained reasoning between criminal facts and instrument labels. The model learns robust criminal-fact representations and propagates clues through a graph attention network to refine instrument-label embeddings; the prediction probability is defined as $P(L_{I/C/A}|H^{F}) = \mathrm{softmax}(\mathrm{sim}(H^{F}, \widetilde{H}^{L}))$. Empirical results on CAIL2018 show SEMDR achieving state-of-the-art performance, especially on low-frequency and confusing charges, with ablations confirming the dominant contribution of graph reasoning and the synergistic effect of all three clue-tracing components. The work advances LJP by reducing model uncertainty and producing more uniform, discriminative representations, with practical impact for robust legal judgments and potential benefits in few-shot settings.
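The prediction rule quoted above, a softmax over similarities between a fact representation and each refined label representation, can be illustrated with a small sketch. This is not the authors' implementation: the summary does not say which similarity function `sim` is, so cosine similarity is an assumption here, and the function names and toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; an assumed choice for sim(.,.), not confirmed by the source."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def predict_label_probs(fact_vec, label_vecs, sim=cosine):
    """Return softmax(sim(H_F, H_L)) over all candidate label vectors."""
    scores = [sim(fact_vec, label_vec) for label_vec in label_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: one fact embedding and three label embeddings (made-up numbers).
fact = [0.9, 0.1, 0.0]
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
probs = predict_label_probs(fact, labels)
```

In this toy setup the first label receives the highest probability because its vector is closest in direction to the fact vector. The same scoring would be applied separately to each of the three label types in the subscript $I/C/A$.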

Abstract

Legal Judgment Prediction (LJP) aims to form legal judgments based on the criminal fact description. However, researchers struggle to classify confusing criminal cases, such as robbery and theft, which requires LJP models to distinguish the nuances between similar crimes. Existing methods usually design handcrafted features to pick up the semantic legal clues needed for more accurate legal judgment predictions. In this paper, we propose a Semantic-Aware Dual Encoder Model (SEMDR), which designs a novel legal clue tracing mechanism to conduct fine-grained semantic reasoning between criminal facts and instruments. Our legal clue tracing mechanism is built from three reasoning levels: 1) Lexicon-Tracing, which aims to extract criminal facts from criminal descriptions; 2) Sentence Representation Learning, which contrastively trains language models to better represent confusing criminal facts; 3) Multi-Fact Reasoning, which builds a reasoning graph to propagate semantic clues among fact nodes and capture the subtle differences among criminal facts. Our legal clue tracing mechanism helps SEMDR achieve state-of-the-art performance on the CAIL2018 dataset and shows its advantage in few-shot scenarios. Our experiments show that SEMDR has a strong ability to learn more uniform and distinguishable representations for criminal facts, which helps it make more accurate predictions on confusing criminal cases and reduces model uncertainty when making judgments. All code will be released via GitHub.
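The second reasoning level, contrastive training of sentence representations, is described elsewhere on this page as using dropout. One common way to realize that idea is an in-batch InfoNCE loss over two dropout-perturbed views of the same fact embedding (the SimCSE-style recipe). The sketch below assumes that recipe; the abstract does not give the loss, the temperature, or the dropout rate, so `info_nce`, `temperature=0.05`, `p=0.1`, and the toy vectors are all illustrative assumptions.

```python
import math
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale the rest."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0


def info_nce(view_a, view_b, temperature=0.05):
    """In-batch contrastive loss: view_a[i] should match view_b[i], not view_b[j]."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denom - logits[i])  # -log softmax of the positive pair
    return sum(losses) / len(losses)


# Toy batch of fact embeddings (made-up numbers standing in for encoder outputs).
rng = random.Random(0)
batch = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]

# Two stochastic "views" of each fact, produced by independent dropout masks.
view_a = [dropout(vec, 0.1, rng) for vec in batch]
view_b = [dropout(vec, 0.1, rng) for vec in batch]
loss = info_nce(view_a, view_b)
```

Minimizing this loss pulls the two views of the same fact together and pushes apart views of different facts in the batch, which is one way to obtain the more uniform and distinguishable fact representations the abstract reports.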

Paper Structure (18 sections, 8 equations, 5 figures, 4 tables)

Introduction
Related Work
Methodology
Preliminary of Legal Judgment Prediction
Clue-Aware Criminal Case Representation
Multi-Evidence Enhancement Instrument Label Representation
Experimental Methodology
Datasets
Baselines
Testing Scenarios
Implementation Details
Evaluation Results
Overall Performance
Assessing the Performance of Legal Judgment Prediction Across Diverse Testing Scenarios
Ablation Study
...and 3 more sections

Figures (5)

Figure 1: An Example of the Legal Clue Tracing Mechanism in SEMDR. We build a finer-grained reasoning framework for legal judgment prediction. The strikethrough part of the fact description denotes the non-clue description, which is filtered out by the lexicon-tracing module.
Figure 2: The Architecture of SEMDR Framework. In the legal judgment reasoning graph, we have defined several types of nodes ${(N)}$ and relations ${(R)}$ between criminal cases and corresponding judgment instruments.
Figure 3: Embedding Visualization (t-SNE) of Different Criminal Facts. Markers in different colours denote the charge embeddings of <robbery, theft, fraud>, respectively, and each $\bullet$ represents the fact embedding of a criminal case. The subfigures show the distribution of criminal cases and legal charges after different modules in SEMDR.
Figure 4: An Easily Misjudged Example Case Predicted by the BERT (w/o Graph Reasoning) Model and SEMDR.
Figure 5: The Contributions of the Legal Clue Tracing Mechanism in Assisting SEMDR with Confusing Charge Prediction. A darker color indicates a higher attention score and a stronger association.
