Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries
Damian Curran, Vanessa Sporne, Lea Frermann, Jeannie Paterson
TL;DR
This study asks whether LLMs’ knowledge of law varies by geographic location and how that affects the quality of legal information provided to users. It introduces a functionalist comparative-law framework that treats practical legal problems as the basis for cross-place evaluation, using 100 place-agnostic scenarios derived from Reddit posts, each evaluated for Los Angeles, London, and Sydney across three LLMs. Manual annotation of LLM outputs against the actual law yields two hallucination metrics, $hr$ and $hr^*$. The results reveal significant place-based differences and a strong negative correlation between hallucination rate and the frequency of the majority response across repeated samples, which suggests a practical uncertainty signal. The work highlights implications for equitable access to justice and underscores the need for jurisdiction-aware validation of AI-assisted legal tools.
Abstract
How can we meaningfully compare a large language model's knowledge of the law in one place with its knowledge of the law in another? Quantifying these differences is critical to understanding whether the quality of legal information obtained by users of LLM-based chatbots varies with their location. However, obtaining meaningful comparative metrics is challenging because legal institutions in different places are not themselves easily comparable. In this work, we propose a methodology for obtaining place-to-place metrics based on the comparative law concept of functionalism. We construct a dataset of factual scenarios drawn from Reddit posts by users seeking legal advice on family, housing, employment, crime and traffic issues. We use each scenario to elicit from the LLM a summary of a relevant law in Los Angeles, London and Sydney. These summaries, typically of a legislative provision, are manually evaluated for hallucinations. We show that the rate at which leading closed-source LLMs hallucinate legal information is significantly associated with place. This suggests that the quality of legal solutions provided by these models is not evenly distributed across geography. Additionally, we show a strong negative correlation between hallucination rate and the frequency of the majority response when the LLM is sampled multiple times, suggesting that this frequency can serve as a measure of the model's uncertainty about legal facts.
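To make the uncertainty signal concrete, the following is a minimal sketch of how a majority-response frequency could be computed from repeated samples and correlated with hallucination labels. It is an illustration under stated assumptions, not the procedure used in this work: the function names, the toy data, and the simple ratio used for the hallucination rate are all hypothetical, and the ratio is not claimed to match the definitions of $hr$ or $hr^*$.

```python
from collections import Counter
from math import sqrt


def majority_frequency(responses):
    """Fraction of sampled responses that match the most common response."""
    if not responses:
        raise ValueError("need at least one sampled response")
    top_count = Counter(responses).most_common(1)[0][1]
    return top_count / len(responses)


def hallucination_rate(flags):
    """Share of evaluated summaries flagged as hallucinated (assumed simple ratio)."""
    return sum(flags) / len(flags)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical toy data: for each scenario, the laws named across five samples
# and whether the majority answer was annotated as a hallucination.
toy_scenarios = [
    (["law A", "law A", "law A", "law A", "law A"], False),
    (["law B", "law B", "law B", "law B", "law C"], False),
    (["law D", "law D", "law D", "law E", "law F"], True),
    (["law G", "law G", "law H", "law I", "law J"], True),
]

frequencies = [majority_frequency(samples) for samples, _ in toy_scenarios]
flags = [int(is_hallucination) for _, is_hallucination in toy_scenarios]

rate = hallucination_rate(flags)
r = pearson(frequencies, flags)
print(f"hallucination rate: {rate:.2f}, correlation: {r:.3f}")
```

On this toy data the correlation is negative: scenarios where the samples agree more often are the ones not flagged as hallucinations. Real use would need many more scenarios and an appropriate significance test.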
