Promises and pitfalls of artificial intelligence for legal applications
Sayash Kapoor, Peter Henderson, Arvind Narayanan
TL;DR
This paper analyzes the potential and limits of AI in legal applications across three kinds of tasks: information processing; tasks involving creativity, reasoning, or judgment; and prediction. It argues that current evidence does not support claims of a legal AI revolution and highlights significant evaluation challenges, such as contamination, construct validity, and prompt sensitivity. The authors propose governance-oriented recommendations, including involving legal experts in evaluating AI (e.g., LegalBench), pursuing naturalistic and task-specific evaluations, and restricting deployment to narrow, high-observability settings with strong transparency. They stress that predictive AI in law demands higher standards of evidence, transparency, and contestability to prevent societal harm. Overall, the work advocates robust socio-technical assessment to ensure safe, evidence-based adoption of AI in legal contexts.
Abstract
Is AI set to redefine the legal profession? We argue that this claim is not supported by the current evidence. We examine AI's increasingly prevalent roles in three types of legal tasks: information processing; tasks involving creativity, reasoning, or judgment; and predictions about the future. We find that the ease of evaluating legal applications varies greatly across legal tasks, depending on how easily correct answers can be identified and how observable the information relevant to the task is. The tasks that would lead to the most significant changes to the legal profession are also the ones most prone to overoptimism about AI capabilities, because they are harder to evaluate. We make recommendations for better evaluation and deployment of AI in legal contexts.
