Neural Legal Judgment Prediction in English

Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras

TL;DR

This work introduces a publicly available English legal judgment prediction dataset derived from the European Court of Human Rights, containing ~11.5k cases with raw text and 66 potential violation labels across three tasks. It systematically evaluates several neural architectures (BiGRU-Att, HAN, LWAN, BERT, and a novel Hierarchical BERT), shows that neural models outperform traditional feature-based baselines, and addresses long-document processing. The study also investigates demographic bias via data anonymization and demonstrates that Hierarchical BERT achieves state-of-the-art performance by effectively handling long case texts. Collectively, the paper provides new resources and insights for applying neural methods to English legal text, with implications for fairness, interpretability, and few-shot learning in legal judgment prediction.

Abstract

Legal judgment prediction is the task of automatically predicting the outcome of a court case, given a text describing the case's facts. Previous work on using neural models for this task has focused on Chinese; only feature-based models (e.g., using bags of words and topics) have been considered in English. We release a new English legal judgment prediction dataset, containing cases from the European Court of Human Rights. We evaluate a broad variety of neural models on the new dataset, establishing strong baselines that surpass previous feature-based models in three tasks: (1) binary violation classification; (2) multi-label classification; (3) case importance prediction. We also explore if models are biased towards demographic information via data anonymization. As a side-product, we propose a hierarchical version of BERT, which bypasses BERT's length limitation.

Paper Structure

This paper contains 22 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Attention over words (colored words) and facts (vertical heat bars) as produced by HAN.