PatSTEG: Modeling Formation Dynamics of Patent Citation Networks via The Semantic-Topological Evolutionary Graph

Ran Miao, Xueyu Chen, Liang Hu, Zhifei Zhang, Minghua Wan, Qi Zhang, Cairong Zhao

TL;DR

PatSTEG addresses the gap in patent citation analysis by jointly modeling semantic content and citation topology in a dynamic graph. It introduces two coupled evolution operators and text-aware node representations to predict future citations, particularly in sparse CNPatG data. Empirical results on CNPatG and public datasets show that PatSTEG with text and dynamic propagation outperforms baselines and provides interpretable, multi-aspect citation insights. This enables improved patent literature mining and technology trend analysis across IPC fields.

Abstract

Patent documents in the patent database (PatDB) are crucial for research, development, and innovation because they contain valuable technical information. However, PatDB is more challenging to work with than publicly available preprocessed databases because of the intricate nature of patent text and the inherent sparsity of the patent citation network. Although patent text analysis and citation analysis each open new opportunities for patent data mining, no existing work exploits their complementarity. To this end, we propose a joint semantic-topological evolutionary graph learning approach (PatSTEG) to model the formation dynamics of patent citation networks. More specifically, we first create a real-world dataset of Chinese patents named CNPat and use its patent texts and citations to construct a patent citation network. PatSTEG then models the evolutionary dynamics of patent citation formation by considering semantic and topological information jointly. Extensive experiments on CNPat and public datasets demonstrate the superiority of PatSTEG over other state-of-the-art methods. The results provide valuable references for patent literature research and technical exploration.

Paper Structure

This paper contains 15 sections, 18 equations, 4 figures, and 4 tables.

Figures (4)

  • Figure 1: (a) Keyword distributions of patents related to chip technology are similar for patents with many citations and patents with few citations. (b) The out-degree distribution of CNPatG is more imbalanced than that of Cora.
  • Figure 2: The architecture of the PatSTEG model with the evolutionary learning process.
  • Figure 3: Performance with different embedding sizes.
  • Figure 4: The different topics of citations.