Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice

Jonathan Li; Rohan Bhambhoria; Samuel Dahan; Xiaodan Zhu

TL;DR

This work introduces and releases LegalQA, a dataset of real, specific legal questions spanning employment law to criminal law, each paired with an answer written by legal experts and with supporting citations. It also proposes future directions for open-source efforts, which currently fall behind closed-source models.

Abstract

Generative AI models, such as the GPT and Llama series, have significant potential to help laypeople answer legal questions. However, little prior work focuses on data sourcing, inference, and evaluation for these models when the intended users are laypeople. To this end, we propose a human-centric legal NLP pipeline covering data sourcing, inference, and evaluation. We introduce and release a dataset, LegalQA, of real and specific legal questions spanning employment law to criminal law, with corresponding answers written by legal experts and citations for each answer. We develop an automatic evaluation protocol for this dataset, then show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval, despite containing 9 orders of magnitude less data. Finally, we propose future directions for open-source efforts, which currently fall behind closed-source models.

Paper Structure (20 sections, 6 figures, 2 tables)


Introduction
Prior Work
Retrieval Augmented Generation.
Datasets and Legal AI Benchmarks.
Better Data for Better AI.
Automatic Evaluation.
Methods
Retrieval Methods.
Generative Baselines.
Results and Discussion
Open-source models fall behind.
Using limited legal documents is just as useful as the entire internet.
GPT-4 outperforms retrieval.
Some categories of questions are more difficult to answer accurately.
In what situations does legal retrieval help?
...and 5 more sections

Figures (6)

Figure 1: An overview of our framework for human-centric legal AI.
Figure 2: Distribution of question lengths and response lengths. Responses are concise and specific.
Figure 3: Retrieval-based methods used for our experiments. Given a legal question, retrieval is performed to generate a relevant answer.
Figure 4: Factual disagreement of each model by category. Lower is better.
Figure 5: Factual disagreement for each model. "GPT-3.5 Legal" is retrieval using only legal documents, and "GPT-3.5 Internet" is retrieval from the entire internet.
...and 1 more figure
