Using LLMs to Discover Legal Factors

Morgan Gray, Jaromir Savelka, Wesley Oliver, Kevin Ashley

TL;DR

The paper tackles the challenge of automatically discovering factor-based representations for legal reasoning without predefined factor lists. It introduces a semi-automated pipeline that uses LLMs to induce factors from raw court opinions and then, with human oversight, refines them into Refined Factor Representations (RFR). The predictive power of these representations is evaluated against Canonical Factor Representations (CFR) using the Matthews correlation coefficient (MCC). Key contributions include demonstrating that factors can feasibly be induced from scratch, showing improved performance with human-in-the-loop refinements, and highlighting a potentially novel factor discovered by the models, thereby enabling more scalable empirical/legal analyses. The work has practical implications for AI-enabled legal analytics: it offers a path to detecting new factors and to comparing factor lists by how well they predict case outcomes, though fully automated end-to-end discovery remains challenging.
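As a rough illustration of the evaluation metric mentioned above, the following sketch computes MCC for binary outcome predictions from the four confusion-matrix counts. It is not the authors' evaluation code: the function name `mcc`, the 0/1 label encoding, and the toy outcome lists are assumptions made only for this example.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1.

    Hypothetical helper for illustration; the paper's own evaluation
    code is not shown in this summary.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Toy, made-up case outcomes (1 and 0 stand for the two possible outcomes).
actual = [1, 1, 0, 0]
predicted_from_factors = [1, 0, 1, 0]
score = mcc(actual, predicted_from_factors)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and it stays informative when the two outcome classes are imbalanced, which makes it a reasonable yardstick for comparing factor lists.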

Abstract

Factors are a foundational component of legal analysis and computational models of legal reasoning. These factor-based representations enable lawyers, judges, and AI and Law researchers to reason about legal cases. In this paper, we introduce a methodology that leverages large language models (LLMs) to discover lists of factors that effectively represent a legal domain. Our method takes as input raw court opinions and produces a set of factors and associated definitions. We demonstrate that a semi-automated approach, incorporating minimal human involvement, produces factor representations that can predict case outcomes with moderate success, if not yet as well as expert-defined factors can.

Paper Structure

This paper contains 10 sections, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: Similarity between Human Refined Factor Representation (left) and DIAS Canonical Factor Representation (right).
  • Figure 2: Similarity between Llama Refined Factor Representation (left) and DIAS Canonical Factor Representation (right).