Legal Prompt Engineering for Multilingual Legal Judgement Prediction

Dietrich Trautmann, Alina Petrova, Frank Schilder

TL;DR

This work explores zero-shot Legal Prompt Engineering (LPE) for Legal Judgement Prediction (LJP) on long legal texts from the European Court of Human Rights (ECHR) and the Federal Supreme Court of Switzerland (FSCS). Using multilingual LLMs without domain-specific data or fine-tuning, it demonstrates that carefully crafted prompts can outperform basic baselines but still fall short of supervised state-of-the-art performance. The study highlights the transferability of general-purpose LLMs to the legal domain and underscores potential cost benefits, while outlining pathways for improvement through subject-matter expert (SME) involvement and broader task coverage. Overall, LPE shows promise for accessible legal NLP tasks, especially where labeled data or training resources are limited.

Abstract

Legal Prompt Engineering (LPE), or Legal Prompting, is a process to guide and assist a large language model (LLM) in performing a natural legal language processing (NLLP) skill. Our goal is to use LPE with LLMs over long legal documents for the Legal Judgement Prediction (LJP) task. We investigate the performance of zero-shot LPE given the facts in case texts from the European Court of Human Rights (in English) and the Federal Supreme Court of Switzerland (in German, French and Italian). Our results show that zero-shot LPE outperforms the baselines, but it still falls short of current state-of-the-art supervised approaches. Nevertheless, the results are important, since 1) no explicit domain-specific data was used, which shows that transfer to the legal domain is possible for general-purpose LLMs, and 2) the LLMs were applied directly without any further training or fine-tuning, which in turn saves substantially on additional computational costs.
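To make the zero-shot setup concrete, the following is a minimal Python sketch of how a prompt for judgement prediction might be assembled and how a model completion might be mapped back to a label. The template wording, the yes/no framing, the 512-word truncation budget, and the function names `build_prompt` and `parse_completion` are all illustrative assumptions; they are not the authors' actual template (which is shown in the paper's Figure 2) or their code.

```python
from typing import Optional

# Hypothetical template: the wording and the yes/no options are assumptions,
# not the prompt used in the paper.
TEMPLATE = (
    "Facts of the case:\n{facts}\n\n"
    "Question: Is any human rights article violated in this case?\n"
    "Options: yes, no\n"
    "Answer:"
)


def build_prompt(facts: str, max_words: int = 512) -> str:
    """Truncate long case facts to a word budget and fill the template.

    The word budget is an assumed stand-in for a model's context limit.
    """
    words = facts.split()
    truncated = " ".join(words[:max_words])
    return TEMPLATE.format(facts=truncated)


def parse_completion(completion: str) -> Optional[str]:
    """Map a free-text completion to 'yes' or 'no'.

    Returns None when the first word is neither option.
    """
    tokens = completion.strip().lower().split()
    if not tokens:
        return None
    first = tokens[0].strip(".,;:!\"'")
    return first if first in {"yes", "no"} else None
```

In such a setup, the string returned by `build_prompt` would be sent to a general-purpose LLM as is, with no fine-tuning, and the first word of the completion would be read as the predicted outcome. Completions that match neither option need an explicit policy, for example counting them as errors.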

Paper Structure (18 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Our Legal Prompting stack.
  • Figure 2: An example prompt template in English for the ECHR task.
  • Figure 3: Completion examples from the GPT-J-6B model on the test set of the ECHR corpus.