Co-Matching: Towards Human-Machine Collaborative Legal Case Matching

Chen Huang, Xinwei Yang, Yang Deng, Wenqiang Lei, JianCheng Lv, Tat-Seng Chua

TL;DR

This paper tackles legal case matching, a task that depends on the tacit expertise of legal practitioners, which is hard to codify for machines. It introduces Co-Matching, a framework in which the practitioner and an off-the-shelf machine each identify key sentences, and a probabilistic fusion mechanism combines their decisions. The fusion is guided by ProtoEM, a novel module that estimates human decision uncertainty without ground truth: it clusters historical decisions into prototypes and uses EM to learn prototype-level confusion matrices, enabling real-time, uncertainty-aware collaboration. Experiments on ELAM and eCAIL show that Co-Matching consistently outperforms both human-only and machine-only baselines, with gains in accuracy and robustness across practitioners with varying levels of tacit knowledge, highlighting the value of human-in-the-loop collaboration in high-stakes legal NLP tasks.
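
To make the idea of learning prototype-level confusion matrices with EM more concrete, here is a minimal sketch in the style of Dawid-Skene estimation. It is not the paper's ProtoEM implementation: the function name `em_confusion`, the use of machine probabilities as a prior over the unknown true label, the smoothing constant, and the assumption that prototype assignments (`proto_ids`) were already produced by some clustering step are all illustrative assumptions.

```python
def em_confusion(machine_probs, human_labels, proto_ids,
                 n_protos, n_classes, n_iter=20, smooth=1e-2):
    """Estimate one confusion matrix per prototype without ground truth.

    conf[k][y][h] approximates P(human says h | true label y, prototype k).
    Assumes n_classes >= 2 and that proto_ids come from a prior clustering
    step (hypothetical; not shown here).
    """
    # Start from a mildly diagonal-dominant guess for every prototype.
    off = 0.3 / (n_classes - 1)
    conf = [[[0.7 if y == h else off for h in range(n_classes)]
             for y in range(n_classes)] for _ in range(n_protos)]

    for _ in range(n_iter):
        # E-step: posterior over the unknown true label of each decision,
        # using the machine probabilities as a prior (an assumption).
        posteriors = []
        for p, h, k in zip(machine_probs, human_labels, proto_ids):
            unnorm = [p[y] * conf[k][y][h] for y in range(n_classes)]
            z = sum(unnorm)
            posteriors.append([u / z for u in unnorm])

        # M-step: re-estimate each prototype's matrix from soft counts.
        counts = [[[smooth] * n_classes for _ in range(n_classes)]
                  for _ in range(n_protos)]
        for q, h, k in zip(posteriors, human_labels, proto_ids):
            for y in range(n_classes):
                counts[k][y][h] += q[y]
        conf = [[[c / sum(row) for c in row] for row in counts[k]]
                for k in range(n_protos)]
    return conf


# Toy history: in prototype 0 the human agrees with a confident machine,
# in prototype 1 the human contradicts it.
machine_probs = [[0.9, 0.1], [0.1, 0.9]] * 4
human_labels = [0, 1, 0, 1, 1, 0, 1, 0]
proto_ids = [0, 0, 0, 0, 1, 1, 1, 1]
conf = em_confusion(machine_probs, human_labels, proto_ids,
                    n_protos=2, n_classes=2)
```

In this toy setting the diagonal of `conf[0]` ends up high and the diagonal of `conf[1]` low, which is the kind of per-prototype difference in human reliability that a finer-grained uncertainty estimate is meant to capture.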

Abstract

Recent efforts have aimed to improve AI machines in legal case matching by integrating legal domain knowledge. However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines. This emphasizes the crucial role of involving legal practitioners in high-stakes legal case matching. To address this, we propose a collaborative matching framework called Co-Matching, which encourages both the machine and the legal practitioner to participate in the matching process, integrating tacit knowledge. Unlike existing methods that rely solely on the machine, Co-Matching allows both the legal practitioner and the machine to determine key sentences and then combine them probabilistically. Co-Matching introduces a method called ProtoEM to estimate human decision uncertainty, facilitating the probabilistic combination. Experimental results demonstrate that Co-Matching consistently outperforms existing legal case matching methods, delivering significant performance improvements over human- and machine-based matching in isolation (on average, +5.51% and +8.71%, respectively). Further analysis shows that Co-Matching also ensures better human-machine collaboration effectiveness. Our study represents a pioneering effort in human-machine collaboration for the matching task, marking a milestone for future collaborative matching studies.
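
The abstract does not give the formula for the probabilistic combination. One common way to fuse a machine's predicted probabilities with a human decision, given an estimate of human uncertainty in the form of a confusion matrix, is Bayes' rule. The sketch below illustrates that generic technique; the function name `combine` and the example numbers are hypothetical and not taken from the paper.

```python
def combine(machine_probs, human_label, confusion):
    """Fuse machine probabilities with one human decision via Bayes' rule.

    machine_probs[y]  : machine's P(true label = y)
    human_label       : label chosen by the human
    confusion[y][h]   : estimated P(human says h | true label y)
    """
    # Likelihood of the observed human decision under each true label.
    likelihood = [confusion[y][human_label] for y in range(len(machine_probs))]
    unnorm = [p * l for p, l in zip(machine_probs, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        # The human decision is impossible under every label:
        # fall back to the machine's belief.
        return list(machine_probs)
    return [u / total for u in unnorm]


# Hypothetical example: the machine mildly prefers label 0, but a fairly
# reliable human says label 1 ("key sentence").
posterior = combine([0.6, 0.4], 1, [[0.9, 0.1], [0.2, 0.8]])
```

Here the human's decision outweighs the machine's mild preference because the confusion matrix says this human is rarely wrong; with a less reliable human the posterior would stay closer to the machine's belief.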

Paper Structure (27 sections, 6 equations, 7 figures, 2 tables, 1 algorithm)

Figures (7)

  • Figure 1: Legal practitioners' tacit knowledge (e.g., work experience) is critical for legal case matching. Human-machine collaborative matching combines the strengths of both the legal practitioner and the machine, leading to enhanced matching results.
  • Figure 2: Co-Matching allows both the legal practitioner and the machine to decide on the key sentences and then combine these sentences in a probabilistic manner, while alleviating the limitations of manual quantitative measurement (Challenge 1, marked with a light blue background). Co-Matching introduces the ProtoEM to estimate human behavior uncertainty and facilitate the probabilistic decision combination (Challenge 2, marked with a white background on the right side).
  • Figure 3: Matching performance when collaborating with legal practitioners of varying levels of tacit knowledge. Co-Matching has strong adaptability to different legal practitioners compared to other baselines. Non-experts hinder the realization of human-machine complementarity due to their lack of legal tacit knowledge.
  • Figure 4: Illustration of uncertainty estimation error. Co-Matching achieves lower estimation error, thanks to the finer-grained uncertainty estimation via ProtoEM.
  • Figure 5: Illustration of key sentence identification accuracy. Co-Matching achieves higher accuracy than the other human-machine collaborative baselines.
  • ...and 2 more figures