Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph

Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang

TL;DR

This work challenges the assumption that scaling up language models automatically improves domain-specific tasks, showing that patent approval prediction does not benefit from larger LLM backbones or prompt engineering. It introduces the Fine-grained cLAim depeNdency (FLAN) Graph to capture both inner-claim and inter-claim dependencies by decomposing claims into sub-components and linking related segments. Through extensive experiments on the large PatentAP dataset, FLAN Graphs combined with lightweight graph neural networks (notably GraphSage) achieve substantial gains over the previous SOTA, while embedding- and prompt-based LLM approaches underperform or are cost-inefficient. The results underscore the value of domain-specific graph structures for complex, knowledge-intensive tasks and suggest directions for future work to explain LLM failures and generalize the approach to broader patent data.

Abstract

Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within the patent data. Specifically, we first extend the embedding-based state-of-the-art (SOTA) by scaling up its backbone model with various sizes of open-source LLMs, then explore prompt-based methods to harness proprietary LLMs' potential, but find the best results close to random guessing, underlining the ineffectiveness of model scaling-up. Hence, we propose a novel Fine-grained cLAim depeNdency (FLAN) Graph through meticulous patent data analyses, capturing the inherent dependencies across segments of the patent text. As it is model-agnostic, we apply cost-effective graph models to our FLAN Graph to obtain representations for approval prediction. Extensive experiments and detailed analyses prove that incorporating FLAN Graph via various graph models consistently outperforms all LLM baselines significantly. We hope that our observations and analyses in this paper can bring more attention to this challenging task and prompt further research into the limitations of LLMs. Our source code and dataset can be obtained from http://github.com/ShangDataLab/FLAN-Graph.
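
To make the idea of inner-claim and inter-claim dependencies concrete, here is a minimal Python sketch of a toy claim dependency graph. It is not the authors' FLAN Graph construction: the punctuation-based segmentation, the `CLAIM_REF` pattern, the function names, and the two example claims are all assumptions made for illustration, and the paper's anchor-based node matching is not reproduced.

```python
import re

# Hypothetical pattern for references such as "of claim 1" (an assumption).
CLAIM_REF = re.compile(r"\bclaim\s+(\d+)\b", re.IGNORECASE)


def split_segments(claim_text):
    """Split a claim into segments at colons and semicolons (toy heuristic)."""
    parts = (p.strip(" ,.") for p in re.split(r"[;:]", claim_text))
    return [p for p in parts if p]


def build_claim_graph(claims):
    """Build (nodes, edges) from a dict mapping claim number -> claim text.

    Nodes are keyed by (claim number, segment index). Edges are
    (source, target, kind) tuples, where kind is "inner" or "inter".
    """
    nodes = {}
    edges = set()
    for number, text in claims.items():
        segments = split_segments(text)
        for index, segment in enumerate(segments):
            nodes[(number, index)] = segment
            if index > 0:
                # Inner-claim edge: first segment (preamble) -> later segment.
                edges.add(((number, 0), (number, index), "inner"))
        for ref in map(int, CLAIM_REF.findall(text)):
            if segments and ref != number and ref in claims:
                # Inter-claim edge: referenced claim -> dependent claim.
                edges.add(((ref, 0), (number, 0), "inter"))
    return nodes, edges


# Invented example claims, not taken from the PatentAP dataset.
claims = {
    1: "A device comprising: a sensor; and a processor coupled to the sensor.",
    2: "The device of claim 1, wherein the sensor is a camera.",
}
nodes, edges = build_claim_graph(claims)
for edge in sorted(edges):
    print(edge)
```

The resulting node texts and edge list could then be embedded and passed to any graph model, which is the sense in which such a structure is model-agnostic.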

Paper Structure (46 sections, 10 figures, 9 tables)

Figures (10)

  • Figure 1: An illustration for the patent approval prediction task approached by LLMs and graph models, where each node of the graph is an informative segment decomposed from the original claim text.
  • Figure 2: A brief example of the typical patent claim writing style and hierarchical dependencies within claims from a real-world patent application.
  • Figure 3: Flowchart of constructing the FLAN Graph. Here, "identities" refers to the anchor words/phrases extracted from the claim or claim segments for node matching.
  • Figure 4: FLAN Graph for Claim 2 in Figure 2. Here, the blue text marks the "identities" used for node matching. Nodes with a red background are derived directly from Claim 2, while the rest are inherited from Claim 1.
  • Figure 5: Performance (%) of the Vicuna-7B model with few-shot prompting and supervised fine-tuning (SFT). Here, SFT does not include any few-shot examples.
  • ...and 5 more figures