Knowledge-Infused Legal Wisdom: Navigating LLM Consultation through the Lens of Diagnostics and Positive-Unlabeled Reinforcement Learning

Yang Wu; Chenghao Wang; Ece Gumusel; Xiaozhong Liu

TL;DR

Non-experts struggle to formulate professional legal queries for LLMs and may omit critical legal factors. The paper proposes the Diagnostic Legal LLM (D3LM), which asks lawyer-like diagnostic questions to collect richer case information before generating high-quality court views. Its Positive-Unlabeled Reinforcement Learning (PURL) framework fuses domain PU models with LLMs for adaptive question generation, and an LLM-based stopping criterion supports precise Court Views Generation (CVG). The authors also create a new English-language US CVG dataset for benchmarking, including IRAC-based summaries and fact-rule graphs. In the reported experiments, D3LM outperforms baselines on both automatic and human evaluation, and usability studies indicate strong user acceptance, suggesting it could improve accuracy and reduce the cost of legal services. Stated limitations are the focus on criminal law, the English-only scope, and high resource requirements; future work targets cross-domain generalization, multilingual support, and efficiency optimization.

Abstract

The integration of generative Large Language Models (LLMs) into various applications, including the legal domain, has been accelerated by their expansive and versatile nature. However, when facing a legal case, users without a legal background often struggle to formulate professional queries and may inadvertently overlook critical legal factors when presenting their case narrative to LLMs. To address this issue, we propose the Diagnostic Legal Large Language Model (D3LM), which utilizes adaptive lawyer-like diagnostic questions to collect additional case information and then provides high-quality feedback. D3LM incorporates an innovative graph-based Positive-Unlabeled Reinforcement Learning (PURL) algorithm, enabling the generation of critical questions and enhancing user-LLM interactions. Moreover, an integrated LLM-based stopping criterion facilitates precise Court Views Generation (CVG). Our research also introduces a new English-language CVG dataset based on the US case law database, enriching the realm of LLM research and deployment with a vital dimension. D3LM surpasses classical LLMs by delivering outstanding performance and a remarkable user experience in the legal domain.
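
The interaction pattern described in the abstract (ask diagnostic questions, then stop once enough information has been collected) can be sketched as a simple loop. This is only an illustrative sketch, not the authors' implementation: the names `Consultation`, `run_diagnostic_loop`, `propose_question`, `has_enough_info`, and the `FACTORS` checklist are all hypothetical. In D3LM the question proposer would be the PURL component and the stopping check would be the LLM-based criterion; here both are replaced by toy keyword rules so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Consultation:
    """Case narrative plus the question/answer pairs collected so far."""
    narrative: str
    qa_history: List[Tuple[str, str]] = field(default_factory=list)


def run_diagnostic_loop(
    narrative: str,
    propose_question: Callable[[Consultation], Optional[str]],
    ask_user: Callable[[str], str],
    has_enough_info: Callable[[Consultation], bool],
    max_turns: int = 5,
) -> Consultation:
    """Ask follow-up questions until the stopping check fires or turns run out."""
    state = Consultation(narrative)
    for _ in range(max_turns):
        if has_enough_info(state):      # stand-in for the LLM-based stopping criterion
            break
        question = propose_question(state)  # stand-in for PURL question generation
        if question is None:
            break
        state.qa_history.append((question, ask_user(question)))
    return state


# Toy stand-ins (assumptions for illustration only): a fixed checklist of factors.
FACTORS = {
    "intent": "Was the act intentional?",
    "injury": "Was anyone injured?",
    "weapon": "Was a weapon involved?",
}


def covered(state: Consultation) -> set:
    """A factor counts as covered if the narrative mentions it or it was asked."""
    text = state.narrative.lower()
    asked = {q for q, _ in state.qa_history}
    return {k for k, q in FACTORS.items() if k in text or q in asked}


def propose_question(state: Consultation) -> Optional[str]:
    """Return the question for the first uncovered factor, or None if all are covered."""
    missing = [k for k in FACTORS if k not in covered(state)]
    return FACTORS[missing[0]] if missing else None


def has_enough_info(state: Consultation) -> bool:
    return len(covered(state)) == len(FACTORS)


result = run_diagnostic_loop(
    "My neighbor and I argued and he reported an injury.",
    propose_question,
    ask_user=lambda q: "no",            # scripted user reply for the demo
    has_enough_info=has_enough_info,
)
for question, answer in result.qa_history:
    print(f"Q: {question} A: {answer}")
```

Passing the proposer and the stopping check as callables keeps the loop independent of how they are realized, so a learned question selector or an LLM call could be swapped in without changing the control flow.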

Paper Structure

This paper contains 31 sections, 8 equations, 7 figures, 6 tables, 3 algorithms.

Introduction
Related Work
Legal Assistant
Enhanced Learning through PU Learning and Reinforcement Learning
Large Model Dataset Creation
Methodology
Problem Definition
D3LM Framework
Positive-unlabeled Reinforcement Learning (PURL) Question Generation
Domain Positive-Unlabeled Model Training
LLM Reading Comprehension
Model Training
Model Inference
Experiments
Data Construction
...and 16 more sections

Figures (7)

Figure 1: Comparison of legal service methodologies, highlighting traditional LLMs, lawyer consultations, and the D3LM model. D3LM innovatively generates professional questions, mirroring the actions of a lawyer, to improve legal outcome accuracy without high costs, demonstrating a cost-effective, precise approach to legal assistance.
Figure 2: D3LM Model Framework Overview. The figure illustrates how D3LM engages the user through context-driven questions, guided by the PURL algorithm from continuous and historical dialogues. The model aims to collect comprehensive case details until a fine-tuned LLM token signals adequate information acquisition.
Figure 3: Illustrative Representation of the PURL Network in Action. This diagram showcases the PURL algorithm's training process using $Case_i$ as an example. It visually delineates the sequential steps of extracting, summarizing, and reconstructing case facts, followed by question generation. Specifically, the collaboration of the LLM with the domain-specific PU model facilitates the final adaptive node selection through NeuralUCB.
Figure 4: Case Study.
Figure 5: Reliability result.
...and 2 more figures
