DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment

Haitao Li, Qingyao Ai, Xinyan Han, Jia Chen, Qian Dong, Yiqun Liu, Chong Chen, Qi Tian

TL;DR

DELTA tackles legal case retrieval by learning discriminative embeddings that emphasize key facts rather than surface textual similarity. It pairs an encoder with two shallow decoders that create information bottlenecks, and adds a Structural Word Alignment (SWA) task in which a deep decoder translates between case structures to pinpoint key facts. The [CLS] embedding is then pulled toward key facts and pushed away from non-key facts in an unsupervised manner. Experiments on Chinese and English benchmarks report state-of-the-art performance in both zero-shot and fine-tuning settings. The result is a structure-aware pre-training framework whose case representations follow the facts that drive the judgment.

Abstract

Recent research demonstrates the effectiveness of using pre-trained language models for legal case retrieval. Most existing work focuses on improving the representation ability of the contextualized embedding of the [CLS] token and calculates relevance using textual semantic similarity. However, in the legal domain, textual semantic similarity does not always imply that cases are sufficiently relevant. Instead, relevance in legal cases primarily depends on the similarity of key facts that impact the final judgment. Without proper treatment, the discriminative ability of learned representations could be limited, since legal cases are lengthy and contain numerous non-key facts. To this end, we introduce DELTA, a discriminative model designed for legal case retrieval. The basic idea involves pinpointing key facts in legal cases and pulling the contextualized embedding of the [CLS] token closer to the key facts while pushing it away from the non-key facts, which can warm up the case embedding space in an unsupervised manner. To be specific, this study brings the word alignment mechanism to the contextual masked auto-encoder. First, we leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability. Second, we employ the deep decoder to enable translation between different structures, with the goal of pinpointing key facts to enhance discriminative ability. Comprehensive experiments conducted on publicly available legal benchmarks show that our approach can outperform existing state-of-the-art methods in legal case retrieval. It provides a new perspective on the in-depth understanding and processing of legal case documents.
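
The pull/push idea in the abstract can be illustrated with a generic contrastive objective. The sketch below is not the loss used in the paper, which this page does not state; it is an InfoNCE-style stand-in that treats key-fact vectors as positives and non-key-fact vectors as negatives for a single [CLS] vector. The function name `pull_push_loss`, the `temperature` parameter, and the toy 2-dimensional vectors are all assumptions made for illustration.

```python
import numpy as np

def pull_push_loss(cls_vec, key_vecs, nonkey_vecs, temperature=0.1):
    """Illustrative InfoNCE-style loss (an assumption, not the paper's formula).

    cls_vec:     (d,)   embedding of the [CLS] token
    key_vecs:    (k, d) embeddings of words treated as key facts (positives)
    nonkey_vecs: (n, d) embeddings of words treated as non-key facts (negatives)
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    cls_vec = normalize(np.asarray(cls_vec, dtype=float))
    key_sim = normalize(np.asarray(key_vecs, dtype=float)) @ cls_vec / temperature
    nonkey_sim = normalize(np.asarray(nonkey_vecs, dtype=float)) @ cls_vec / temperature

    # Numerically stable log-sum-exp over all candidates (positives and negatives).
    all_sim = np.concatenate([key_sim, nonkey_sim])
    max_sim = all_sim.max()
    log_denom = max_sim + np.log(np.exp(all_sim - max_sim).sum())

    # Average negative log-probability assigned to the key-fact vectors.
    return float(np.mean(log_denom - key_sim))

# Toy data: key facts lie near the x-axis, non-key facts near the y-axis.
key = np.array([[1.0, 0.0], [0.9, 0.1]])
nonkey = np.array([[0.0, 1.0], [0.1, 0.9]])

loss_near_key = pull_push_loss(np.array([1.0, 0.0]), key, nonkey)
loss_near_nonkey = pull_push_loss(np.array([0.0, 1.0]), key, nonkey)
print(loss_near_key < loss_near_nonkey)
```

Minimizing a loss of this shape moves the [CLS] vector toward the key-fact vectors and away from the non-key ones, which is the behavior the abstract describes. How DELTA actually selects key facts (via the deep decoder's word alignment) is not modeled here.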

Paper Structure

This paper contains 31 sections, 18 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: An illustrative example of relevance in legal case retrieval. The key facts in these cases are highlighted in red and the non-key facts in blue. Despite sharing a substantial number of words, these two cases are legally irrelevant. The divergence in key facts results in distinctly different judgments.
  • Figure 2: Pre-training designs of DELTA. DELTA creates information bottlenecks with two shallow decoders to improve the representation ability of the $[CLS]$ vector. Furthermore, the Structural Word Alignment task is employed to identify key facts. DELTA pulls $[CLS]$ vectors closer to key facts and pushes them away from non-key facts to enhance discriminative ability.
  • Figure 3: An example showing the key facts and non-key facts determined by DELTA. Words in red are key facts, and words in black are non-key facts.
  • Figure 4: Visual analysis of SAILER and DELTA in the zero-shot setting.