CaseLink: Inductive Graph Learning for Legal Case Retrieval

Yanran Tang, Ruihong Qiu, Hongzhi Yin, Xue Li, Zi Huang

TL;DR

CaseLink is an inductive graph learning model that utilises the intrinsic connectivity among cases for legal case retrieval. It incorporates a novel Global Case Graph that represents both the case semantic relationship and the case legal charge relationship, and it is optimised with a novel contrastive objective regularised on the degree of case nodes.

Abstract

In case law, precedents are the relevant cases used to support the decisions of judges and the opinions of lawyers on a given case. This relevance is referred to as the case-to-case reference relation. To find relevant cases efficiently in a large case pool, legal practitioners widely use retrieval tools. Existing legal case retrieval models mainly work by comparing the text representations of individual cases. Although they achieve decent retrieval accuracy, the intrinsic connectivity relationships among cases have not been well exploited for case encoding, which limits further improvement of retrieval performance. A case pool contains three types of case connectivity relationships: the case reference relationship, the case semantic relationship, and the case legal charge relationship. Because legal case retrieval is an inductive task, case references are not available as input at test time. Thus, this paper proposes CaseLink, a model based on inductive graph learning that utilises the intrinsic case connectivity for legal case retrieval. A novel Global Case Graph is incorporated to represent both the case semantic relationship and the case legal charge relationship. A novel contrastive objective with a regularisation on the degree of case nodes is proposed to leverage the information carried by the case reference relationship to optimise the model. Extensive experiments on two benchmark datasets demonstrate the state-of-the-art performance of CaseLink. The code has been released at https://github.com/yanran-tang/CaseLink.
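The abstract names a contrastive objective combined with a regularisation on the degree of case nodes, but this page does not reproduce the paper's equations. The sketch below is therefore only an illustration under stated assumptions: `info_nce` is the standard InfoNCE loss over cosine similarities, while `degree_penalty`, the additive combination in `overall_loss`, and the default values of `tau` and `lam` are hypothetical choices made for this example, not the authors' definitions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, tau=0.1):
    """Standard InfoNCE: pull the query towards its referenced case and
    push it away from the negative cases. tau is a temperature."""
    pos = math.exp(cosine(query, positive) / tau)
    neg = sum(math.exp(cosine(query, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))


def degree_penalty(queries, candidates):
    """HYPOTHETICAL degree regulariser (an assumption for illustration).
    Each non-negative query-candidate similarity is treated as a soft edge
    weight, and the average soft degree per candidate is penalised so that
    a candidate is not linked to every query."""
    total = 0.0
    for d in candidates:
        total += sum(max(cosine(q, d), 0.0) for q in queries)
    return total / len(candidates)


def overall_loss(query, positive, negatives, queries, candidates,
                 lam=0.01, tau=0.1):
    """ASSUMED combination: contrastive term plus lam times the penalty."""
    return (info_nce(query, positive, negatives, tau)
            + lam * degree_penalty(queries, candidates))
```

In this sketch, a larger `lam` favours sparser predicted reference links at the cost of a weaker contrastive signal, which is the kind of trade-off a sensitivity study over $\lambda$ would probe.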

Paper Structure (52 sections, 13 equations, 8 figures, 6 tables)

Figures (8)

  • Figure 1: The inductive nature of case reference in legal case retrieval. (a) During training, a labelled dataset contains query cases (green nodes), candidate cases (white nodes), and the ground truth reference between queries and candidates (solid edges). For simplicity, edges are denoted as undirected. (b) During inductive testing, given an unlabelled and unseen dataset with new query cases (blue nodes) and candidate cases (grey nodes), legal case retrieval models are expected to uncover case references in dashed edges.
  • Figure 2: An illustration of Global Case Graph. Green nodes are query cases $q_1$ and $q_2$, white nodes are candidate cases $d_1\sim d_3$ and orange nodes are legal charges $c_1\sim c_4$. Solid lines are edges, including case-case edges in blue, case-charge edges in red and charge-charge edges in yellow.
  • Figure 3: The comparison between typical LCR models and CaseLink. (a) Existing LCR models generally apply a text encoder to the query and the candidate individually. The LCR prediction is obtained by performing nearest neighbour search on these non-interactive encodings. (b) The overall framework of CaseLink. During training, the training queries, the training candidates and the charges are transformed into a Global Case Graph (GCG). A graph neural network (GNN) module updates the node features of the GCG. The updated query and candidate node features are fed into the contrastive learning (InfoNCE) objective and the degree regularisation (DegReg) objective to train the CaseLink model. During inference, the testing queries, the testing candidates and the charges are transformed into another GCG. After the GNN module has updated the node features, the retrieval result is obtained by nearest neighbour search based on the similarity among these case node features.
  • Figure 4: Parameter sensitivity of $\lambda$ in Equation (\ref{eq:loss-overall}).
  • Figure 5: Parameter sensitivity of TopK BM25 neighbours in case-case edge construction from Equation (\ref{eq:d-edge}).
  • ...and 3 more figures
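
The Figure 2 and Figure 5 captions describe a Global Case Graph with case nodes, charge nodes, and three edge types, where case-case edges come from TopK BM25 neighbours. A minimal sketch of that structure follows, under several assumptions that are not taken from the paper: `overlap_score` is a simple shared-token count standing in for BM25, a case-charge edge is added when the charge name occurs in the case text, charge-charge edges are supplied by the caller, and case-case and charge-charge edges are stored as undirected. The function names and the dictionary layout are hypothetical.

```python
from collections import Counter


def overlap_score(text_a, text_b):
    """Stand-in for BM25 (an assumption): number of shared tokens."""
    tokens_a = Counter(text_a.lower().split())
    tokens_b = Counter(text_b.lower().split())
    return sum((tokens_a & tokens_b).values())


def build_global_case_graph(cases, charges, charge_pairs, top_k=1):
    """Build the three edge sets of a Global Case Graph.

    cases:        dict mapping case id -> case text (queries and candidates)
    charges:      list of legal charge names
    charge_pairs: iterable of (charge, charge) pairs deemed related
    top_k:        number of lexical neighbours linked to each case
    """
    edges = {"case-case": set(), "case-charge": set(), "charge-charge": set()}

    for case_id, text in cases.items():
        # Rank every other case by lexical score; ties are broken by id.
        ranked = sorted(
            ((overlap_score(text, other_text), other_id)
             for other_id, other_text in cases.items() if other_id != case_id),
            key=lambda pair: (-pair[0], pair[1]),
        )
        for score, other_id in ranked[:top_k]:
            if score > 0:
                edges["case-case"].add(frozenset((case_id, other_id)))

        # ASSUMPTION: a case is linked to a charge named in its text.
        for charge in charges:
            if charge.lower() in text.lower():
                edges["case-charge"].add((case_id, charge))

    for charge_a, charge_b in charge_pairs:
        edges["charge-charge"].add(frozenset((charge_a, charge_b)))

    return edges


example_cases = {
    "q1": "the accused committed theft of a car",
    "d1": "theft of a car by the accused",
    "d2": "fraud in a tax return",
}
example_graph = build_global_case_graph(
    example_cases, ["theft", "fraud"], [("theft", "fraud")], top_k=1
)
```

Raising `top_k` densifies the case-case edges, so more neighbours contribute to each case node when a GNN later aggregates over this graph.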