Toward Robust Legal Text Formalization into Defeasible Deontic Logic using LLMs

Elias Horner, Cristinel Mateis, Guido Governatori, Agata Ciabattoni

TL;DR

This work shows that large language models can be effectively guided to translate complex legal language into Defeasible Deontic Logic (DDL) via a structured pipeline that segments text into atomic snippets and enforces a rigorous evaluation of completeness and correctness. The authors systematically compare prompt-based single-stage approaches, multi-snippet strategies, fine-tuning, and two-stage pipelines, finding that carefully crafted Chain-of-Instructions prompts on individual law snippets yield the best performance, with a refinement step improving cross-snippet coherence. They validate their approach on Australian TCP Code norms, outperforming a key prior method under recalibrated metrics and demonstrating high deontic accuracy. The study underscores the potential for scalable legal informatics while identifying key limitations in evaluation metrics, cross-referencing, and interpretation, and outlines concrete future work in active learning, multilingual formalization, and end-user tooling.

Abstract

We present a comprehensive approach to the automated formalization of legal texts using large language models (LLMs), targeting their transformation into Defeasible Deontic Logic (DDL). Our method employs a structured pipeline that segments complex normative language into atomic snippets, extracts deontic rules, and evaluates them for syntactic and semantic coherence. We introduce a refined success metric that more precisely captures the completeness of formalizations, and a novel two-stage pipeline with a dedicated refinement step to improve logical consistency and coverage. The evaluation procedure has been strengthened with stricter error assessment, and we provide comparative results across multiple LLM configurations, including newly released models and various prompting and fine-tuning strategies. Experiments on legal norms from the Australian Telecommunications Consumer Protections Code demonstrate that, when guided effectively, LLMs can produce formalizations that align closely with expert-crafted representations, underscoring their potential for scalable legal informatics.

Paper Structure

This paper contains 26 sections, 6 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Success scores of various LLMs
  • Figure 2: Success scores of all LLMs (perfect formalizations only)
  • Figure 3: Success scores when formalizing all law snippets at once
  • Figure 4: Success scores when formalizing all law snippets at once (perfect formalizations only)
  • Figure 5: Success scores after fine-tuning
  • ...and 5 more figures