ELLA: Empowering LLMs for Interpretable, Accurate and Informative Legal Advice

Yutong Hu, Kangcheng Luo, Yansong Feng

TL;DR

This work addresses the reliability gap in legal LLM advice by introducing ELLA, a framework that grounds LLM responses in retrieved legal articles and cases and provides sentence-level interpretability. It comprises four components—Chat Interface, Interactive Legal Article Selection, Response Interpretation, and Legal Case Retrieval—supported by fine-tuned embedding models ($BGE_1$, $BGE_2$) and a threshold-based explainability mechanism ($Thr_1=0.85$, $Thr_2=0.65$). Automated evaluation shows that the fine-tuned embedding for interpretation improves ranking metrics (NDCG@K), while a user study demonstrates that interactive article selection and case retrieval enhance accuracy and readability, despite some noise in top-3 article selections. The approach promises more trustworthy legal consultations and can be extended to other jurisdictions by incorporating additional legal knowledge sources and more advanced retrieval modules.
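As a reference for the ranking metric mentioned above, here is a minimal sketch of NDCG@K with linear gains. The function names `dcg_at_k` and `ndcg_at_k` are illustrative, and the ideal ranking is computed from the same list of relevance labels, which assumes every relevant candidate appears in that list. The paper's exact evaluation protocol is not described in this summary and may differ.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items of a ranked list."""
    # rank is 0-based, so the discount for the top item is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels in the order the retriever ranked the candidates
    # (hypothetical values, not taken from the paper).
    ranked_labels = [0, 1, 1, 0]
    print(ndcg_at_k(ranked_labels, k=3))
```

A ranking that places all relevant items first scores 1.0, and the score drops as relevant items move down the list.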

Abstract

Despite the remarkable performance in legal consultation exhibited by legal Large Language Models (LLMs) combined with legal article retrieval components, there are still cases where the advice given is incorrect or baseless. To alleviate these problems, we propose ELLA, a tool for Empowering LLMs for interpretable, accurate, and informative Legal Advice. ELLA visually presents the correlation between legal articles and the LLM's response by calculating their similarities, providing users with an intuitive legal basis for the responses. In addition, based on users' queries, ELLA retrieves relevant legal articles and displays them to users, who can interactively select legal articles for the LLM to generate more accurate responses. ELLA also retrieves relevant legal cases for user reference. Our user study shows that presenting the legal basis for a response helps users understand it better. The accuracy of the LLM's responses also improves when users intervene in selecting legal articles for the LLM. Providing relevant legal cases also helps users obtain comprehensive information.
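The similarity-based linking of response sentences to legal articles can be sketched as follows. This is only an illustration under stated assumptions: the embeddings are taken as precomputed vectors (the fine-tuned BGE models are not called here), cosine similarity is assumed as the similarity measure, and a single cutoff `threshold` is used. The summary mentions two thresholds (0.85 and 0.65) without describing their roles, so the function takes the cutoff as a parameter rather than assigning them a meaning. The names `cosine` and `link_sentences_to_articles` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_sentences_to_articles(
    sentence_vecs: Sequence[Vector],
    article_vecs: Dict[str, Vector],
    threshold: float,
) -> List[List[str]]:
    """For each response sentence, list the articles whose similarity
    reaches the threshold, most similar first."""
    links: List[List[str]] = []
    for sent in sentence_vecs:
        scored = [(cosine(sent, vec), name) for name, vec in article_vecs.items()]
        scored.sort(reverse=True)
        links.append([name for score, name in scored if score >= threshold])
    return links


if __name__ == "__main__":
    # Toy 2-d vectors standing in for real sentence/article embeddings.
    articles = {"A1": [1.0, 0.0], "A2": [0.0, 1.0]}
    sentences = [[1.0, 0.1], [0.5, 0.5]]
    print(link_sentences_to_articles(sentences, articles, threshold=0.85))
```

A sentence with no article above the cutoff gets an empty list, which a user interface could render as a sentence without a highlighted legal basis.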

Paper Structure

This paper contains 22 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Examples of incomplete, incorrect, and inconsistent responses. $A_i$ indicates the $i$-th article in the Civil Code. Blue articles are relevant to the query, while orange ones are irrelevant. A blue star means the article is retrieved for the LLM. We only show the key information in the figure. For the complete conversations, please refer to Appendix \ref{app:conversation}.
  • Figure 2: Screenshot of ELLA. We show the complete conversation in Appendix \ref{app:conversation}, Table \ref{table:q1} and Table \ref{table:q4}.
  • Figure 3: The system architecture overview.
  • Figure 4: Schematic of Dataset Construction. The blue sentences indicate the sentences with the highest BM25 scores and the orange sentences are the most irrelevant ones. Blue lines indicate positive cases and orange lines indicate negative cases.