Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling

Hang Jiang; Xiajie Zhang; Robert Mahari; Daniel Kessler; Eric Ma; Tal August; Irene Li; Alex 'Sandy' Pentland; Yoon Kim; Deb Roy; Jad Kabbara

TL;DR

This work addresses the accessibility of complex legal concepts for non-experts by using large language models (LLMs) to generate explanatory stories and assessment questions. Through an expert-in-the-loop pipeline, the authors create LegalStories, a dataset pairing legal doctrines with definitions, stories, and multiple-choice questions, and validate the approach with human evaluations and a randomized controlled trial (RCT) across native and non-native English speakers. Across models, GPT-4 delivers the most coherent and faithful stories and questions, yet expert critique remains essential to curb errors. The RCT shows that, compared with definitions alone, stories improve comprehension and interest in law among non-native speakers, help participants relate legal concepts to their own lives, and yield higher retention for non-native speakers in a follow-up assessment. These results suggest strong potential for LLM-driven legal education while underscoring the need for careful design and oversight to manage risks and preserve nuance.

Abstract

Making legal knowledge accessible to non-experts is crucial for enhancing general legal literacy and encouraging civic participation in democracy. However, legal documents are often challenging to understand for people without legal backgrounds. In this paper, we present a novel application of large language models (LLMs) in legal education to help non-experts learn intricate legal concepts through storytelling, an effective pedagogical tool in conveying complex and abstract concepts. We also introduce a new dataset LegalStories, which consists of 294 complex legal doctrines, each accompanied by a story and a set of multiple-choice questions generated by LLMs. To construct the dataset, we experiment with various LLMs to generate legal stories explaining these concepts. Furthermore, we use an expert-in-the-loop approach to iteratively design multiple-choice questions. Then, we evaluate the effectiveness of storytelling with LLMs through randomized controlled trials (RCTs) with legal novices on 10 samples from the dataset. We find that LLM-generated stories enhance comprehension of legal concepts and interest in law among non-native speakers compared to only definitions. Moreover, stories consistently help participants relate legal concepts to their lives. Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment. Our work has strong implications for using LLMs in promoting teaching and learning in the legal field and beyond.
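The pipeline summarized in the abstract (definition, then story, then multiple-choice questions, then expert-guided regeneration) can be sketched as a small data structure plus one function. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `LegalStoryRecord`, `MultipleChoiceQuestion`, and `build_record`, as well as the three callables passed in, are hypothetical, and no particular LLM API is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class MultipleChoiceQuestion:
    """One LLM-generated question (hypothetical layout)."""
    question: str
    options: List[str]
    answer_index: int  # position of the correct option in `options`


@dataclass
class LegalStoryRecord:
    """One dataset entry: a doctrine with its definition, story, and questions."""
    concept: str
    definition: str
    story: str = ""
    questions: List[MultipleChoiceQuestion] = field(default_factory=list)


# Hypothetical callable types standing in for LLM calls and expert review.
StoryFn = Callable[[str, str], str]
QuestionFn = Callable[[str, str, Optional[str]], List[MultipleChoiceQuestion]]
ReviewFn = Callable[[List[MultipleChoiceQuestion]], Optional[str]]


def build_record(
    concept: str,
    definition: str,
    generate_story: StoryFn,
    generate_questions: QuestionFn,
    expert_review: Optional[ReviewFn] = None,
) -> LegalStoryRecord:
    """Assumed flow: story from the concept, questions from definition + story,
    then one regeneration pass if an expert supplies advice."""
    story = generate_story(concept, definition)
    questions = generate_questions(definition, story, None)

    if expert_review is not None:
        advice = expert_review(questions)
        if advice:
            # Regenerate the questions with the expert's advice as extra input.
            questions = generate_questions(definition, story, advice)

    return LegalStoryRecord(concept, definition, story, questions)
```

Passing the generation and review steps as callables keeps the sketch independent of any model provider, so each step can be replaced with a stub for testing or with a real LLM client in practice.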

Paper Structure (72 sections, 29 figures, 8 tables)

Introduction
Related Work
Legal NLP & Accessible Language
Storytelling in Education and NLP
Educational Question Generation
LegalStories Dataset
Doctrine Definitions from Wikipedia
Story and Question Generation
Story Generation
Question Generation
Question Refinement with Expertise
Evaluation
Story Evaluation
Human Evaluation
Results
...and 57 more sections

Figures (29)

Figure 1: Illustration of the expert-in-the-loop pipeline. The left section demonstrates the procedure to produce an LLM-generated story from the concept. The lower section in the center shows how we use both the definition and story as input to produce LLM-generated reading comprehension (RC) questions. The center upper section shows that we first collect expert feedback on questions and regenerate questions with expert advice. The right section outlines the RCT experiment to see if LLM-generated stories improve comprehension in legal concepts.
Figure 2: Distribution of questions with or without issues generated by LLaMA 2, GPT-3.5, and GPT-4.
Figure 3: Distribution of different issues among the questions generated by LLaMA 2, GPT-3.5, and GPT-4.
Figure 4: Consent form on Prolific.
Figure 5: Concept definition example.
...and 24 more figures
