Distinguish Confusion in Legal Judgment Prediction via Revised Relation Knowledge

Nuo Xu; Pinghui Wang; Junzhou Zhao; Feiyang Sun; Lin Lan; Jing Tao; Li Pan; Xiaohong Guan

TL;DR

This work tackles confusion among similar law articles and data-imbalance-induced posterior confusion in Legal Judgment Prediction (LJP). It introduces D-LADAN, an end-to-end framework that combines a Graph Distillation Operator (GDO) over law-article communities with a momentum-updated memory mechanism to revise label relations in light of training dynamics, yielding a three-component fact representation $\tilde{\mathbf{v}}_f=[\mathbf{v}_f^{\text{b}} \oplus \mathbf{v}_f^{\text{p}} \oplus \mathbf{v}_f^{\text{r}}]$ for prediction. The method includes a graph construction layer, graph distillation, revised memories, and distinguishable attention-based re-encoders, with a loss that jointly optimizes prediction and memory updates. Empirical results on CAIL-small/big and Criminal datasets show state-of-the-art performance and improved tail-category accuracy, demonstrating robustness to data imbalance and better generalization. The approach is modular and adaptable to transformer backbones, and is extended with a D-LADAN_BERT variant to leverage token-level representations. Overall, D-LADAN offers a principled way to fuse prior legal knowledge with data-driven revision to enhance LJP reliability and fairness.

Abstract

Legal Judgment Prediction (LJP) aims to automatically predict the judgment results of a law case from the text description of its facts. In practice, the problem of confusing law articles (or charges) occurs frequently: cases to which similar articles (or charges) apply tend to be misjudged. Although recent works based on prior knowledge handle this issue well, they overlook a further finding of this work: because of data imbalance, confusion also arises between law articles with high posterior semantic similarity, not only between those that are highly similar a priori. This paper proposes an end-to-end model named \textit{D-LADAN} to address both challenges. On the one hand, D-LADAN constructs a graph among law articles based on their text definitions and proposes a graph distillation operation (GDO) to distinguish those with high prior semantic similarity. On the other hand, D-LADAN presents a novel momentum-updated memory mechanism that dynamically senses the posterior similarity between law articles (or charges), together with a weighted GDO that adaptively captures their distinctions to revise the inductive bias caused by data imbalance. Extensive experiments demonstrate that D-LADAN significantly outperforms state-of-the-art methods in accuracy and robustness.
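The momentum-updated memory mentioned in the abstract can be illustrated with a minimal sketch. The exact update rule is not given in this section, so the exponential-moving-average form, the momentum values, the label names, and the use of cosine similarity as the "posterior similarity" are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Dict, List


def momentum_update(memory: List[float], feature: List[float],
                    momentum: float = 0.9) -> List[float]:
    """Assumed EMA rule: m <- momentum * m + (1 - momentum) * feature."""
    return [momentum * m + (1.0 - momentum) * x for m, x in zip(memory, feature)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical per-label memories (one vector per law article or charge).
memory: Dict[str, List[float]] = {
    "label_a": [1.0, 0.0],
    "label_b": [0.0, 1.0],
}

# Hypothetical batch feature for label_b that drifts toward label_a,
# as might happen for a rare label under data imbalance.
memory["label_b"] = momentum_update(memory["label_b"], [1.0, 0.0], momentum=0.5)

# Posterior similarity sensed from the memories after the update.
similarity = cosine(memory["label_a"], memory["label_b"])
```

In this toy run the two memories start orthogonal and become similar after a single update, which is the kind of posterior drift a weighted GDO would then have to counteract.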

Paper Structure (31 sections, 25 equations, 9 figures, 9 tables)

Introduction
Related Work
Legal Judgment Prediction
Graph Neural Networks
Problem Formulation
Our Method
Overview of Framework
Distilling Law Articles
Graph Construction Layer
Graph Distillation Layer
Distilling Revised Memories
Revised Memory
Fully-connected Similarity Graph
Weighted Graph Distillation Layer
Re-encoding Fact with Distinguishable Attention
...and 16 more sections

Figures (9)

Figure 1: Examples of prior confusing charges. We marked similar text with the same pattern, such as bold, italic, red font, pink font, and blue underline.
Figure 2: a. The fact-law attention framework of luo2017learning. b. The attention framework of LADAN, where the inductive bias caused by the data imbalance problem would damage the prior relationship structure. c. Our framework further revises the inductive bias of the data imbalance by constructing a posterior relational structure. Variables $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$ represent the context vectors learned from law articles for attentively extracting features from fact descriptions.
Figure 3: The frequency distribution of the law articles and charges on CAIL-small and the corresponding accuracy of LADAN on each category. Note that the IDs of the x-axis have been sorted in descending order of frequency.
Figure 4: Overview of our framework D-LADAN: it takes the fact descriptions of cases and the text definitions of law articles as inputs. Then, it extracts the basic representation $\mathbf{v}_f^{\text{b}}$, the prior distinguishing representation $\mathbf{v}_f^{\text{p}}$, and the revised distinguishing representation $\mathbf{v}_f^{\text{r}}$ of the fact descriptions through the corresponding encoders. Finally, it combines these three representations for the downstream prediction tasks.
Figure 5: Law Distillation Module: this module groups law articles based on the prior similarity relation and distills the distinguishable features of each community for attention calculation of the prior encoder.
...and 4 more figures
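
The Law Distillation Module described in the Figure 5 caption (group law articles by prior similarity, then distill what distinguishes each one) can be sketched roughly as follows. This is a simplified stand-in under stated assumptions: the thresholded cosine graph, the connected-component grouping, and the "subtract the neighbor mean" step are illustrative choices, and all function names are hypothetical. The paper's actual graph distillation operator is not specified in this section.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(vectors: List[List[float]], threshold: float) -> Dict[int, Set[int]]:
    """Connect two law articles when their similarity exceeds the threshold."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) > threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def communities(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Group articles into connected components of the similarity graph."""
    seen: Set[int] = set()
    groups: List[List[int]] = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups


def distill(vectors: List[List[float]], adj: Dict[int, Set[int]]) -> List[List[float]]:
    """Assumed simplification: remove the mean of the neighbors from each node,
    keeping only what differs from similar articles. Isolated nodes are unchanged."""
    result = []
    for i, vec in enumerate(vectors):
        neighbors = adj[i]
        if not neighbors:
            result.append(list(vec))
            continue
        mean = [sum(vectors[j][k] for j in neighbors) / len(neighbors)
                for k in range(len(vec))]
        result.append([a - b for a, b in zip(vec, mean)])
    return result


# Hypothetical 2-d embeddings of three law-article definitions.
article_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph = build_graph(article_vectors, threshold=0.8)
groups = communities(graph)
distilled = distill(article_vectors, graph)
```

In this toy example the first two articles are near-duplicates and fall into one community, and the subtraction leaves only the small components in which they differ; the third article has no similar neighbor and passes through unchanged.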
