KRAG Framework for Enhancing LLMs in the Legal Domain
Nguyen Ha Thanh, Ken Satoh
TL;DR
The paper addresses the problem that large language models (LLMs) struggle with precise, reliable legal reasoning because of hallucinations and gaps in domain knowledge. It introduces Knowledge Representation Augmented Generation (KRAG), a framework that combines retrieval of domain-specific knowledge with structured reasoning over graphs to guide generation, and implements it for the legal domain as Soft PROLEG. The approach is formalized by the KRAG equation $KRAG(p) = g(f_{\text{retrieval}}(p, \mathcal{K}), f_{\text{structure}}(p, \mathcal{G}))$ and the Soft PROLEG construct $A = \text{SoftPROLEG}(C_1, C_2, \dots, C_n; E)$, which decomposes a conclusion $A$ into subconditions $C_1, \dots, C_n$ and makes its exceptions $E$ explicit. Proof-of-concept results show that KRAG-enabled backbones (GPT-3.5-SP, GPT-4-SP) attain higher accuracy and stability than the baseline models, with improved consistency and explainability from graph-based reasoning. This suggests practical value for automated legal analysis and possible extension to other regulated fields.
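To make the two formulas concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes boolean truth values and the usual rule-with-exception reading, in which $A$ holds when every $C_i$ holds and no exception in $E$ holds. The names `SoftProlegRule`, `keyword_retrieval`, `rule_structure`, and `build_prompt` are hypothetical, and `build_prompt` only assembles an augmented prompt as a stand-in for the LLM generation step $g$.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SoftProlegRule:
    """Toy version of A = SoftPROLEG(C_1, ..., C_n; E) over boolean facts."""
    conclusion: str
    conditions: List[str]
    exceptions: List[str] = field(default_factory=list)

    def evaluate(self, facts: Dict[str, bool]) -> bool:
        # Assumption: A holds when every C_i holds and no exception in E holds.
        # Facts that are not listed are treated as false.
        conditions_hold = all(facts.get(c, False) for c in self.conditions)
        exception_holds = any(facts.get(e, False) for e in self.exceptions)
        return conditions_hold and not exception_holds


def krag(prompt, knowledge, graph, f_retrieval, f_structure, g):
    """KRAG(p) = g(f_retrieval(p, K), f_structure(p, G))."""
    return g(f_retrieval(prompt, knowledge), f_structure(prompt, graph))


# Hypothetical stand-ins for f_retrieval, f_structure, and g.
def keyword_retrieval(prompt: str, knowledge: List[str]) -> List[str]:
    # Keep knowledge entries that share at least one word with the prompt.
    words = set(prompt.lower().split())
    return [k for k in knowledge if words & set(k.lower().split())]


def rule_structure(prompt: str, graph: List[SoftProlegRule]) -> List[str]:
    # Render every rule in the graph as one line of text (prompt is unused here).
    return [
        f"{r.conclusion} <= {', '.join(r.conditions)} "
        f"unless {', '.join(r.exceptions) or 'nothing'}"
        for r in graph
    ]


def build_prompt(retrieved: List[str], structured: List[str]) -> str:
    # Stand-in for g: a real system would pass this text to an LLM.
    return "\n".join(["Context:"] + retrieved + ["Rules:"] + structured)


rule = SoftProlegRule("contract_valid", ["offer", "acceptance"], ["duress"])
augmented = krag(
    "Is the contract valid",
    ["A contract requires offer and acceptance", "Tort law basics"],
    [rule],
    keyword_retrieval,
    rule_structure,
    build_prompt,
)
```

In this sketch, `rule.evaluate({"offer": True, "acceptance": True})` is true, and adding `"duress": True` to the facts makes it false, which is the role the exception term $E$ plays in the construct.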
Abstract
This paper introduces Knowledge Representation Augmented Generation (KRAG), a novel framework designed to enhance the capabilities of Large Language Models (LLMs) in domain-specific applications. KRAG emphasizes the strategic inclusion of critical knowledge entities and relationships that are typically absent from standard datasets and that LLMs do not inherently learn. For legal applications, we present Soft PROLEG, an implementation model under KRAG that uses inference graphs to help LLMs deliver structured legal reasoning, argumentation, and explanations tailored to user inquiries. Integrating KRAG, either as a standalone framework or in tandem with retrieval augmented generation (RAG), markedly improves the ability of language models to navigate and solve the intricate challenges posed by legal texts and terminology. This paper details KRAG's methodology, its implementation through Soft PROLEG, and potential broader applications, underscoring its role in advancing natural language understanding and processing in specialized knowledge domains.
