Objection Overruled! Lay People can Distinguish Large Language Models from Lawyers, but still Favour Advice from an LLM

Eike Schneiders, Tina Seabrooke, Joshua Krook, Richard Hyde, Natalie Leesakul, Jeremie Clos, Joel Fischer

TL;DR

This study probes how laypeople respond to LLM-generated versus lawyer-generated legal advice, focusing on two questions: how willing they are to act on such advice, and whether they can tell the two sources apart. Across three experiments (total N = 288), the authors find that participants are more willing to act on LLM-generated advice when the source is not disclosed, a result that held in both Experiments 1 and 2. In Experiment 3, participants discriminated between the two sources above chance (mean AUC = 0.59). The discussion links these effects to language complexity and potential overtrust in AI, and draws out policy implications, including transparency and AI literacy strategies. The work underscores a practical risk: non-experts may overvalue AI-generated legal guidance when the source is opaque, even though they retain a partial ability to identify AI authorship. Overall, the findings motivate caution in deploying LLM-based legal assistance and point to avenues for improving transparency and user education.

Abstract

Large Language Models (LLMs) are seemingly infiltrating every domain, and the legal context is no exception. In this paper, we present the results of three experiments (total N = 288) that investigated lay people's willingness to act upon, and their ability to discriminate between, LLM- and lawyer-generated legal advice. In Experiment 1, participants judged their willingness to act on legal advice when the source of the advice was either known or unknown. When the advice source was unknown, participants indicated that they were significantly more willing to act on the LLM-generated advice. The result of the source unknown condition was replicated in Experiment 2. Intriguingly, despite participants indicating higher willingness to act on LLM-generated advice in Experiments 1 and 2, participants discriminated between the LLM- and lawyer-generated texts significantly above chance-level in Experiment 3. Lastly, we discuss potential explanations and risks of our findings, limitations and future work.

Paper Structure (38 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Example of prompt for Traffic Law (#4) with corresponding LLM and Lawyer generated advice.
  • Figure 2: Example of prompt for Property Law (#2) with corresponding LLM and Lawyer generated advice.
  • Figure 3: Mean willingness to act ratings in Experiments 1 and 2. Error bars represent difference-adjusted, within-subjects, 95% confidence intervals (Baguley, 2012).
  • Figure 4: Example of prompt for Planning Law (#6) with corresponding LLM and Lawyer generated advice.
  • Figure 5: ROC curve for Experiment 3. The curve bows towards the top-left corner, indicating above-chance discrimination of LLM- and lawyer-generated legal advice. The area under the ROC curve (AUC) is highlighted in light blue (M = .59, SD = .18). Faint lines represent individual participant ROC curves.
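
To make the AUC reported for Figure 5 more concrete, the sketch below shows one common way a per-participant AUC can be computed from source-judgement ratings. It is an illustration only, not the authors' analysis code: the function name `auc_from_ratings`, the 1 to 6 rating scale, and the ratings themselves are all assumptions invented for this example. It relies on the standard equivalence between the area under the ROC curve and the probability that a randomly chosen LLM-generated text receives a higher "looks LLM-written" rating than a randomly chosen lawyer-generated text, with ties counted as one half.

```python
from itertools import product


def auc_from_ratings(llm_ratings, lawyer_ratings):
    """Area under the ROC curve via pairwise comparison.

    Returns the share of (LLM text, lawyer text) pairs in which the LLM text
    received the higher rating; tied pairs contribute 0.5.
    0.5 means chance-level discrimination, 1.0 means perfect discrimination.
    """
    pairs = list(product(llm_ratings, lawyer_ratings))
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return score / len(pairs)


# Hypothetical ratings from a single made-up participant
# (assumed scale: 1 = surely lawyer-written, 6 = surely LLM-written).
llm_texts = [5, 4, 6, 3]
lawyer_texts = [2, 4, 3, 5]

auc = auc_from_ratings(llm_texts, lawyer_texts)
print(f"AUC for this hypothetical participant: {auc:.3f}")
```

Averaging such per-participant values across a sample would yield a mean and standard deviation of the kind summarised in the Figure 5 caption; a mean above 0.5 indicates above-chance discrimination.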