Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning
Rujing Yao, Yang Wu, Chenghao Wang, Jingwei Xiong, Fang Wang, Xiaozhong Liu
TL;DR
This work tackles the reliability gap of large language models in legal question answering by integrating logical reasoning with semantic retrieval. The proposed Logical-Semantic Integration Model (LSIM) combines a learnable fact-rule chain, a supervised DSSM-powered RAG component, and in-context learning to generate precise, legally coherent answers. Empirical results on a real-world criminal-law dataset show that LSIM outperforms strong baselines on both automatic metrics and human evaluation across multiple LLM backbones, indicating improved accuracy and reliability. The approach brings AI-generated legal advice closer to professional legal reasoning and offers a pathway to other specialized domains and multi-turn interactions.
Abstract
Large Language Models (LLMs) have achieved impressive results across numerous domains, yet they show notable deficiencies in legal question-answering tasks. LLMs often generate generalized responses that lack the logical specificity required for expert legal advice, and they are prone to hallucination, providing answers that appear correct but are unreliable. Retrieval-Augmented Generation (RAG) techniques partially address this challenge, but existing approaches typically focus only on semantic similarity and neglect the logical structure essential to legal reasoning. In this paper, we propose the Logical-Semantic Integration Model (LSIM), a novel supervised framework that bridges semantic and logical coherence. LSIM comprises three components: a reinforcement learning module that predicts a structured fact-rule chain for each question, a trainable Deep Structured Semantic Model (DSSM) that retrieves the most relevant candidate questions by integrating semantic and logical features, and an in-context learning step that generates the final answer using the retrieved content. Our experiments on a real-world legal QA dataset, validated through both automated metrics and human evaluation, demonstrate that LSIM significantly enhances accuracy and reliability compared to existing methods.
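To make the retrieve-then-prompt flow in the abstract more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation. The fact-rule chains are given as hand-written labels rather than predicted by reinforcement learning. A bag-of-words cosine and a Jaccard overlap stand in for the trained DSSM. The mixing weight `alpha`, all function names, and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (assumed stand-in for a learned semantic encoder)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def chain_overlap(c1, c2) -> float:
    """Jaccard overlap of two fact-rule chains (assumed stand-in for logical features)."""
    s1, s2 = set(c1), set(c2)
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


def score(query, query_chain, candidate, alpha=0.5):
    """Blend semantic and logical similarity; alpha is a hypothetical weight."""
    semantic = cosine(query, candidate["question"])
    logical = chain_overlap(query_chain, candidate["chain"])
    return alpha * semantic + (1 - alpha) * logical


def retrieve(query, query_chain, corpus, k=1, alpha=0.5):
    """Return the k candidates with the highest blended score."""
    ranked = sorted(
        corpus, key=lambda c: score(query, query_chain, c, alpha), reverse=True
    )
    return ranked[:k]


def build_prompt(query, examples):
    """Assemble an in-context prompt from the retrieved question-answer pairs."""
    shots = "\n\n".join(
        f"Question: {e['question']}\nAnswer: {e['answer']}" for e in examples
    )
    return f"{shots}\n\nQuestion: {query}\nAnswer:"


# Toy corpus with placeholder answers; the chain labels are invented for illustration.
corpus = [
    {
        "question": "My neighbor took my bicycle without asking, what can I do?",
        "chain": ["fact:taking_property", "rule:theft"],
        "answer": "(placeholder answer about theft)",
    },
    {
        "question": "Someone hit me during an argument, what can I do?",
        "chain": ["fact:physical_injury", "rule:intentional_injury"],
        "answer": "(placeholder answer about injury)",
    },
]

query = "A coworker took my phone without asking"
query_chain = ["fact:taking_property", "rule:theft"]  # assumed to be predicted upstream
top = retrieve(query, query_chain, corpus, k=1)
prompt = build_prompt(query, top)
```

The resulting `prompt` would then be passed to an LLM of choice. In this sketch, setting `alpha=0.0` ranks candidates purely by fact-rule chain overlap, which shows how a logical signal can reorder candidates that look similar on the surface.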
