PatentEdits: Framing Patent Novelty as Textual Entailment

Ryan Lee, Alexander Spangher, Xuezhe Ma

TL;DR

This work designs algorithms to label patent edits sentence by sentence, establishes how well these edits can be predicted with large language models (LLMs), and demonstrates that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remained unchanged or are novel in relation to prior art.

Abstract

A patent must be deemed novel and non-obvious in order to be granted by the US Patent and Trademark Office (USPTO). If it is not, a US patent examiner will cite the prior work, or prior art, that invalidates the novelty and issue a non-final rejection. Predicting which claims of the invention should change given the prior art is an essential step in securing invention rights, yet it has not been studied before as a learnable task. In this work we introduce the PatentEdits dataset, which contains 105K examples of successful revisions that overcome objections to novelty. We design algorithms to label edits sentence by sentence, then establish how well these edits can be predicted with large language models (LLMs). We demonstrate that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remained unchanged or are novel in relation to prior art.

Paper Structure

This paper contains 20 sections, 1 equation, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Simplified patent application timeline. PatentEdits aligns text data from the draft, cited references and final patents to understand patent revision.
  • Figure 2: Shown are the extracted edit labels for US Patent 8677435. On the left are draft claims and on the right are final patent claims, with edges denoting a sentence match.
  • Figure 3: We obtain high-quality retrievals of relevant citations with an LLM and treat each retrieval as the positive, the draft sentence as the anchor, and the final sentence as the negative in a triplet loss. The final patent sentence serves as the negative because we assume it has been rewritten to be semantically different from the draft. Each training example for the retriever is one triplet of sentences.
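
The triplet setup described in the Figure 3 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the margin of 1.0, the function names, and the toy two-dimensional "embeddings" are all assumptions chosen for illustration. It only shows how a loss over (anchor = draft sentence, positive = cited sentence, negative = final sentence) behaves.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: zero once the positive is closer to the
    anchor than the negative by at least `margin` (margin value is assumed)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical toy embeddings standing in for encoded sentences.
draft = [1.0, 0.0]   # anchor: draft claim sentence
cited = [1.0, 1.0]   # positive: retrieved cited-reference sentence
final = [4.0, 4.0]   # negative: rewritten final patent sentence

# Cited sentence already near the draft, final sentence far away: no penalty.
loss_easy = triplet_loss(draft, cited, final)

# Roles swapped: the "positive" is far and the "negative" is near, so the loss is large.
loss_hard = triplet_loss(draft, final, cited)
```

Under these assumptions, training a retriever would push its encoder to place the cited sentence closer to the draft sentence than the final sentence is, which is what drives the loss toward zero.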