On Verifiable Legal Reasoning: A Multi-Agent Framework with Formalized Knowledge Representations

Albert Sadowski, Jarosław A. Chudziak

TL;DR

This paper tackles the problem that AI legal reasoning often trades off accuracy for transparency when using end-to-end models. It proposes SOLAR, a two-stage neuro-symbolic framework that first constructs a formal ontology (TBox) from statutes and then applies it to cases via a symbolic engine operating on ABox facts, yielding verifiable, auditable conclusions. Empirical results on the SARA numeric tax dataset show substantial improvements for foundational models (up to 76.4% average accuracy) and a dramatic narrowing of the gap to dedicated reasoning systems, alongside explicit inspection points throughout the process. The findings suggest that modular, ontology-based reasoning can democratize sophisticated legal analysis by delivering reliable, interpretable results with favorable computational characteristics, with promising prospects for broader regulatory domains.

Abstract

Legal reasoning requires both precise interpretation of statutory language and consistent application of complex rules, presenting significant challenges for AI systems. This paper introduces a modular multi-agent framework that decomposes legal reasoning into distinct knowledge acquisition and application stages. In the first stage, specialized agents extract legal concepts and formalize rules to create verifiable intermediate representations of statutes. The second stage applies this knowledge to specific cases through three steps: analyzing queries to map case facts onto the ontology schema, performing symbolic inference to derive logically entailed conclusions, and generating final answers using a programmatic implementation that operationalizes the ontological knowledge. This bridging of natural language understanding with symbolic reasoning provides explicit and verifiable inspection points, significantly enhancing transparency compared to end-to-end approaches. Evaluation on statutory tax calculation tasks demonstrates substantial improvements, with foundational models achieving 76.4% accuracy compared to 18.8% baseline performance, effectively narrowing the performance gap between reasoning and foundational models. These findings suggest that modular architectures with formalized knowledge representations can make sophisticated legal reasoning more accessible through computationally efficient models while enhancing consistency and explainability in AI legal reasoning, establishing a foundation for future research into more transparent, trustworthy, and effective AI systems for the legal domain.
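
The split between statute-level knowledge (the TBox) and case-level facts (the ABox) described in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: the `Rule` and `ABox` classes, the `infer` function, and the concept names are all hypothetical, and the "statute" is invented for illustration. It only shows how symbolic inference over asserted facts can leave a trace that a reviewer can inspect step by step.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical TBox rule: if every premise holds, the conclusion is entailed."""
    name: str
    premises: tuple   # concept names that must already hold for the individual
    conclusion: str   # concept name that becomes entailed


@dataclass
class ABox:
    """Hypothetical case facts: concepts asserted for each individual."""
    facts: dict = field(default_factory=dict)  # individual -> set of concept names

    def assert_fact(self, individual, concept):
        self.facts.setdefault(individual, set()).add(concept)


def infer(tbox, abox):
    """Forward-chain the rules to a fixed point and return the derivation trace."""
    trace = []
    changed = True
    while changed:
        changed = False
        for individual, concepts in abox.facts.items():
            for rule in tbox:
                if set(rule.premises) <= concepts and rule.conclusion not in concepts:
                    concepts.add(rule.conclusion)
                    trace.append((individual, rule.name, rule.conclusion))
                    changed = True
    return trace


# Invented rules standing in for a formalized statute (assumption, not from SARA).
TOY_TBOX = [
    Rule("r1", ("Married", "FilesJointly"), "JointFiler"),
    Rule("r2", ("JointFiler", "HasDependent"), "EligibleForToyCredit"),
]

abox = ABox()
for concept in ("Married", "FilesJointly", "HasDependent"):
    abox.assert_fact("alice", concept)

trace = infer(TOY_TBOX, abox)
for individual, rule_name, conclusion in trace:
    print(f"{individual}: {rule_name} => {conclusion}")
```

Each entry in `trace` names the rule that fired, which is the kind of inspection point the abstract refers to: a wrong answer can be traced back either to a mis-mapped fact in the ABox or to a mis-formalized rule in the TBox.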

Paper Structure

This paper contains 21 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Knowledge Acquisition. The multi-agent pipeline transforms legal statute text into a formal $TBox$ and executable interpreter through parallel concept extraction and rule formulation, followed by integration, validation, and iterative refinement based on training examples.
  • Figure 2: Knowledge Application. The system processes user queries by mapping case facts to the ontology schema, performing symbolic inference on the resulting $ABox$, and generating answers through the pre-computed $TBox$ interpreter.
  • Figure 3: Performance comparison across model types on statutory reasoning tasks (SARA numeric dataset, 10% tolerance). Under the zero-shot and Chain-of-Code approaches, the gap between reasoning and non-reasoning models is 68.2 and 37.1 percentage points, respectively. The SOLAR framework reduces this gap. Error bars show the min-max range across models within each category. Takeaway: Structured ontological reasoning enables foundational models to achieve near-reasoning-model performance on complex legal tasks.
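
The "10% tolerance" in the Figure 3 caption implies that a numeric answer counts as correct when it is close enough to the reference amount. The sketch below shows one plausible reading, assuming a relative tolerance measured against the gold value; this section does not give the exact definition, so the functions `within_tolerance` and `accuracy`, and the treatment of a zero gold value as requiring an exact match, are assumptions. The sample numbers are made up.

```python
def within_tolerance(predicted, gold, tolerance=0.10):
    """Assumed criterion: absolute error is at most `tolerance` times |gold|.

    With gold == 0 this reduces to requiring an exact match.
    """
    return abs(predicted - gold) <= tolerance * abs(gold)


def accuracy(predictions, golds, tolerance=0.10):
    """Fraction of predictions that fall within the tolerance of their gold value."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        raise ValueError("at least one example is required")
    hits = sum(within_tolerance(p, g, tolerance) for p, g in zip(predictions, golds))
    return hits / len(golds)


# Made-up tax amounts, for illustration only.
sample_predictions = [1050, 1150, 0, 480]
sample_golds = [1000, 1000, 0, 500]
print(accuracy(sample_predictions, sample_golds))  # 3 of 4 within 10%: 0.75
```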