UQLegalAI@COLIEE2025: Advancing Legal Case Retrieval with Large Language Models and Graph Neural Networks

Yanran Tang, Ruihong Qiu, Zi Huang

TL;DR

The paper tackles legal case retrieval by proposing CaseLink, a graph-based framework that builds Global Case Graphs to encode intrinsic connectivity among cases and charges. Node texts are embedded with a dedicated large language model, and a graph neural network refines representations under an InfoNCE contrastive loss augmented by degree regularisation. On COLIEE 2025 Task 1, CaseLink achieves a strong second-place finish with an F1 of 0.2962, demonstrating stable performance and the value of incorporating cross-case relationships and charge-level structure. The approach offers practical implications for scalable, connectivity-aware retrieval in legal corpora and suggests avenues for future improvements with more powerful models and post-processing strategies.
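The refinement step mentioned above, where a graph neural network updates LLM-derived node embeddings using graph connectivity, can be pictured with a minimal sketch. This is a generic, unparameterised mean-aggregation round written in plain Python; it is not the GNN used in CaseLink, and the function name `mean_aggregate` and the `self_weight` mixing factor are assumptions made for illustration only.

```python
def mean_aggregate(features, edges, self_weight=0.5):
    """One round of mean-neighbour smoothing over an undirected graph.

    features: dict mapping node id -> embedding (list of floats)
    edges: list of (u, v) pairs, treated as undirected
    self_weight: hypothetical mixing factor between a node and its neighbours
    """
    # Build an adjacency list from the edge pairs.
    neighbours = {node: [] for node in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, vec in features.items():
        nbrs = neighbours[node]
        if not nbrs:
            # Isolated nodes keep their original embedding.
            updated[node] = list(vec)
            continue
        dim = len(vec)
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(dim)]
        updated[node] = [
            self_weight * vec[i] + (1 - self_weight) * mean[i] for i in range(dim)
        ]
    return updated


# Toy 2-dimensional "embeddings" standing in for LLM text embeddings.
features = {"q1": [1.0, 0.0], "d1": [0.0, 1.0], "c1": [1.0, 1.0]}
edges = [("q1", "d1")]
refined = mean_aggregate(features, edges)
```

A trained GNN layer would additionally apply learnable weight matrices and a non-linearity; the sketch only shows how connected nodes pull each other's representations together.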

Abstract

Legal case retrieval plays a pivotal role in the legal domain by facilitating the efficient identification of relevant cases, helping legal professionals and researchers propose legal arguments and make informed decisions. To improve retrieval accuracy, the Competition on Legal Information Extraction and Entailment (COLIEE) is held annually, offering updated benchmark datasets for evaluation. This paper presents a detailed description of CaseLink, the method employed by UQLegalAI, the second-ranked team in Task 1 of COLIEE 2025. The CaseLink model utilises inductive graph learning and Global Case Graphs to capture the intrinsic connectivity among cases and thereby improve the accuracy of legal case retrieval. Specifically, a large language model specialized in text embedding is employed to transform legal texts into embeddings, which serve as the feature representations of the nodes in the constructed case graph. A new contrastive objective, incorporating a regularization on the degree of case nodes, is proposed to leverage the information within the case reference relationships for model optimization. The main codebase used in our method is based on an open-source repository of CaseLink: https://github.com/yanran-tang/CaseLink.
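To make the shape of such an objective concrete, the following is a minimal sketch in plain Python of an InfoNCE contrastive term combined with a degree-style penalty. The InfoNCE part follows the standard formulation; the penalty is only an illustrative stand-in, since the exact form of the degree regularisation is not given in this summary. The names `info_nce`, `degree_penalty`, `total_loss`, and the values of `temperature` and `lam` are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.1):
    """Standard InfoNCE: pull the positive close, push negatives away."""
    pos = math.exp(cosine(query, positive) / temperature)
    neg = sum(math.exp(cosine(query, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def degree_penalty(candidates, queries):
    """Illustrative penalty (an assumption, not the paper's formula):
    the average, over candidates, of summed absolute similarity to all queries.
    It discourages a candidate from being 'connected' to many queries at once."""
    per_candidate = [sum(abs(cosine(c, q)) for q in queries) for c in candidates]
    return sum(per_candidate) / len(per_candidate)


def total_loss(query, positive, negatives, candidates, queries, lam=0.01):
    """Contrastive term plus a weighted penalty; lam is a hypothetical weight."""
    return info_nce(query, positive, negatives) + lam * degree_penalty(candidates, queries)


# Toy example: the positive matches the query, the negative is orthogonal.
easy = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
penalty = degree_penalty([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
```

In the toy example, `easy` is close to zero because the positive is already the nearest case, while `hard` is large because a negative outranks the positive.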

Paper Structure

This paper contains 24 sections, 14 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: An example of a Global Case Graph is shown, where green nodes represent the two query cases $q_1$ and $q_2$, white nodes denote candidate cases $d_1\sim d_3$ and orange nodes correspond to legal charges $c_1\sim c_4$. The solid lines indicate the edges: Case-Case edges are shown in blue, Case-Charge edges in red, and Charge-Charge edges in yellow.
  • Figure 2: The overall framework of CaseLink.
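The graph layout described in the Figure 1 caption can be sketched as a small typed graph. The node names and the three edge types come from the caption; the specific edges listed in `EDGES` are hypothetical, since the caption does not say which nodes are connected, and the helper names `edge_type` and `group_edges` are likewise assumptions.

```python
from collections import defaultdict

# Node types from the Figure 1 caption: queries and candidates are cases,
# c1..c4 are legal charges.
NODE_TYPES = {
    "q1": "case", "q2": "case",
    "d1": "case", "d2": "case", "d3": "case",
    "c1": "charge", "c2": "charge", "c3": "charge", "c4": "charge",
}

# Hypothetical edges chosen only to exercise all three edge types.
EDGES = [
    ("q1", "d1"), ("q2", "d3"),   # Case-Case
    ("d1", "c1"), ("c2", "q1"),   # Case-Charge
    ("c1", "c2"), ("c3", "c4"),   # Charge-Charge
]


def edge_type(u, v):
    """Derive the edge type from the endpoint node types, order-independent."""
    kinds = sorted((NODE_TYPES[u], NODE_TYPES[v]))
    return "-".join(kinds)  # "case-case", "case-charge" or "charge-charge"


def group_edges(edges):
    """Group edges by their derived type."""
    grouped = defaultdict(list)
    for u, v in edges:
        grouped[edge_type(u, v)].append((u, v))
    return dict(grouped)


grouped = group_edges(EDGES)
```

Deriving the edge type from the endpoints keeps the edge list free of redundant labels, which is convenient when the same structure is later converted into typed adjacency matrices for a graph library.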