Using LLMs to Discover Legal Factors
Morgan Gray, Jaromir Savelka, Wesley Oliver, Kevin Ashley
TL;DR
The paper tackles the challenge of automatically discovering factor-based representations for legal reasoning without a predefined factor list. It introduces a semi-automated pipeline that uses LLMs to induce factors from raw court opinions and then, with human oversight, refines them into Refined Factor Representations (RFR). The predictive power of these representations is compared against Canonical Factor Representations (CFR) using the Matthews correlation coefficient (MCC). Key contributions include demonstrating that factors can be induced from scratch, showing that human-in-the-loop refinement improves performance, and highlighting a potentially novel factor discovered by the models, which together support more scalable empirical legal analysis. The work has practical implications for AI-enabled legal analytics: it offers a way to detect new factors and to compare factor lists by how well they predict case outcomes, though fully automated end-to-end discovery remains challenging.
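As a rough illustration of the evaluation metric named above, the following minimal sketch computes MCC for binary case-outcome predictions, assuming MCC denotes the Matthews correlation coefficient. The function name, the 0/1 label encoding, and the sample labels are hypothetical and are not taken from the paper; the authors' actual evaluation code may differ.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical outcomes (1 = plaintiff wins, 0 = defendant wins).
actual = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = matthews_corrcoef(actual, predicted)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when the two outcome classes are imbalanced, which is one plausible reason to prefer it over plain accuracy when comparing factor lists.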
Abstract
Factors are a foundational component of legal analysis and computational models of legal reasoning. These factor-based representations enable lawyers, judges, and AI and Law researchers to reason about legal cases. In this paper, we introduce a methodology that leverages large language models (LLMs) to discover lists of factors that effectively represent a legal domain. Our method takes as input raw court opinions and produces a set of factors and associated definitions. We demonstrate that a semi-automated approach, incorporating minimal human involvement, produces factor representations that can predict case outcomes with moderate success, if not yet as well as expert-defined factors can.
