Adaptive Cost-Efficient Evaluation for Reliable Patent Claim Validation

Yongmin Yoo, Qiongkai Xu, Longbing Cao

Abstract

Automated validation of patent claims demands zero-defect tolerance, as even a single structural flaw can render a claim legally defective. Existing evaluation paradigms suffer from a rigidity-resource dilemma: lightweight encoders struggle with nuanced legal dependencies, while exhaustive verification via Large Language Models (LLMs) is prohibitively costly. To bridge this gap, we propose ACE (Adaptive Cost-efficient Evaluation), a hybrid framework that uses predictive entropy to route only high-uncertainty claims to an expert LLM. The expert then executes a Chain of Patent Thought (CoPT) protocol grounded in 35 U.S.C. statutory standards. This design enables ACE to handle long-range legal dependencies more effectively while preserving efficiency. ACE achieves the best F1 among the evaluated methods at 94.95%, while reducing operational costs by 78% compared to standalone LLM deployments. We also construct ACE-40k, a 40,000-claim benchmark with MPEP-grounded error annotations, to facilitate further research.

Paper Structure

This paper contains 54 sections, 8 equations, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: Patent claim validation requires zero-defect tolerance. Existing paradigms face a rigidity–resource dilemma between encoder-only efficiency and LLM-only reasoning strength.
  • Figure 2: The proposed ACE framework. It integrates a high-throughput Gatekeeper with a deep-reasoning Expert LLM via uncertainty-based routing. High-entropy claims ($U > \tau$) trigger the CoPT protocol for rigorous verification, while unambiguous claims are processed via the Fast Path, optimizing accuracy and latency.
  • Figure 3: ROC curves comparing the ACE Gatekeeper against various baselines. The near-ideal curve of the Gatekeeper (AUC=0.9716) demonstrates its superior ability to distinguish claim validity compared to domain-specific encoders and lexical baselines.
  • Figure 4: Risk-Coverage Trade-off Analysis. The dual-axis chart visualizes the relationship between the escalation rate, predictive performance (Retained F1 on non-escalated samples, red line), and operational cost (Estimated Cost, blue bars).
  • Figure 5: Fine-grained Performance Analysis of the Gatekeeper. While robust on Dependency patterns, it struggles with Antecedent Basis (lowest Recall/F1), confirming the "Antecedent Bottleneck" in tracking long-range entity lifecycles.
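
The routing rule named in Figure 2 (escalate when $U > \tau$) can be illustrated with a minimal sketch. It assumes that $U$ is the Shannon entropy of the Gatekeeper's class-probability output; this summary does not give the exact definition, so treat that as an assumption. The function names, the example probabilities, and the threshold value 0.3 are all hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector.

    Assumption: the Gatekeeper emits a normalized probability vector
    (for example, softmax over valid/invalid).
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def route(probs: Sequence[float], tau: float) -> str:
    """Return 'expert' when uncertainty U exceeds tau, else 'fast_path'."""
    return "expert" if predictive_entropy(probs) > tau else "fast_path"


def escalation_rate(batch: Sequence[Sequence[float]], tau: float) -> float:
    """Fraction of claims in a batch that would be sent to the expert LLM."""
    if not batch:
        return 0.0
    escalated = sum(1 for probs in batch if route(probs, tau) == "expert")
    return escalated / len(batch)


if __name__ == "__main__":
    tau = 0.3  # hypothetical threshold, chosen only for illustration
    confident = [0.99, 0.01]  # low entropy: stays on the fast path
    ambiguous = [0.5, 0.5]    # maximum binary entropy: escalated
    print(route(confident, tau), route(ambiguous, tau))
    print(escalation_rate([confident, ambiguous], tau))
```

Raising `tau` in this sketch lowers the escalation rate, and with it the number of expert calls, which is the kind of trade-off that Figure 4 plots against retained F1 and estimated cost.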