Knowledge-Driven Cross-Document Relation Extraction

Monika Jain, Raghava Mutharaju, Kuldeep Singh, Ramakanth Kavuluru

TL;DR

The paper tackles cross-document relation extraction (CrossDocRE) by embedding domain knowledge into the reasoning process. It introduces KXDocRE, which leverages entity type context (EC), connecting path context (CC), or their combination (ECC) along with entity- and relevance-based filters, an encoder, a cross-path relation matrix, a Transformer, a classifier, and an explanation module. Experiments on CodRED show consistent improvements over prior CrossDocRE methods, with ECC providing the strongest gains and the explanations offering interpretable justification for predictions. This work advances practical CrossDocRE by improving accuracy and transparency in multi-document knowledge discovery tasks.

Abstract

Relation extraction (RE) is a well-known NLP application often treated as a sentence- or document-level task. However, a handful of recent efforts explore it across documents or in the cross-document setting (CrossDocRE). This is distinct from the single document case because different documents often focus on disparate themes, while text within a document tends to have a single goal. Linking findings from disparate documents to identify new relationships is at the core of the popular literature-based knowledge discovery paradigm in biomedicine and other domains. Current CrossDocRE efforts do not consider domain knowledge, which is often assumed to be known to the reader when documents are authored. Here, we propose a novel approach, KXDocRE, that embeds domain knowledge of entities with input text for cross-document RE. Our proposed framework has three main benefits over baselines: 1) it incorporates domain knowledge of entities along with documents' text; 2) it offers interpretability by producing explanatory text for predicted relations between entities; and 3) it improves performance over the prior methods.

Paper Structure (18 sections, 1 equation, 8 figures, 6 tables, 1 algorithm)

Figures (8)

  • Figure 1: Three text paths indicate the relationship path between the source entity, GCompris, and the target entity, GNU Project. These connections are established through pairs of documents, where one document features the source entity, and the other contains the target entity. In each path, the connection between the source and target entities is led by a mentioned entity in both documents (e.g., Linux).
  • Figure 2: Architecture diagram of KXDocRE for cross-document relation extraction. Here EC represents entity context, CC represents connecting context and ECC represents both entity and connecting path context.
  • Figure 3: Context path constructed from Wikidata between Jim Lynagh (source) and Irish Republic (target).
  • Figure 4: An example of a co-occurring graph for path 3 in Figure \ref{fig:mesh1}.
  • Figure 5: Study of the relevance- and entity-based filters on KXDocRE.
  • ...and 3 more figures
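
The text paths described in the Figure 1 caption (one document mentioning the source entity, another mentioning the target entity, linked by an entity that appears in both) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `build_text_paths`, the dictionary-of-sets document representation, and the toy documents are all assumptions made for illustration. Only the entity names GCompris, Linux, and GNU Project come from the caption.

```python
from itertools import product


def build_text_paths(docs, source, target):
    """Enumerate (source_doc, bridge_entity, target_doc) text paths.

    `docs` maps a document id to the set of entity names it mentions.
    This representation is hypothetical, chosen only for illustration.
    """
    source_docs = [d for d, ents in docs.items() if source in ents]
    target_docs = [d for d, ents in docs.items() if target in ents]

    paths = []
    for src_doc, tgt_doc in product(source_docs, target_docs):
        if src_doc == tgt_doc:
            continue  # cross-document setting: the two documents must differ
        # A bridge is any entity mentioned in both documents,
        # other than the source and target themselves.
        bridges = (docs[src_doc] & docs[tgt_doc]) - {source, target}
        for bridge in sorted(bridges):
            paths.append((src_doc, bridge, tgt_doc))
    return paths


# Toy documents (made up); entity mentions are given as plain sets.
docs = {
    "doc_a": {"GCompris", "Linux"},
    "doc_b": {"Linux", "GNU Project"},
    "doc_c": {"GCompris", "KDE"},
}
paths = build_text_paths(docs, "GCompris", "GNU Project")
print(paths)
```

Here `doc_c` mentions the source entity but shares no entity with `doc_b`, so it contributes no path. A filtering step such as the entity- and relevance-based filters named in the TL;DR would operate on candidate paths of this kind, though how KXDocRE scores them is not specified in this section.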