Knowledge-Driven Cross-Document Relation Extraction
Monika Jain, Raghava Mutharaju, Kuldeep Singh, Ramakanth Kavuluru
TL;DR
The paper tackles cross-document relation extraction (CrossDocRE) by embedding domain knowledge into the reasoning process. It introduces KXDocRE, which leverages entity type context (EC), connecting path context (CC), or their combination (ECC), together with entity- and relevance-based filters, an encoder, a cross-path relation matrix, a Transformer, a classifier, and an explanation module. Experiments on CodRED show consistent improvements over prior CrossDocRE methods, with ECC providing the strongest gains and the explanation module offering interpretable justifications for predictions. The work advances practical CrossDocRE by improving both accuracy and transparency in multi-document knowledge discovery tasks.
Abstract
Relation extraction (RE) is a well-known NLP application often treated as a sentence- or document-level task. However, a handful of recent efforts explore it across documents, that is, in the cross-document setting (CrossDocRE). This is distinct from the single-document case because different documents often focus on disparate themes, while text within a document tends to have a single goal. Linking findings from disparate documents to identify new relationships is at the core of the popular literature-based knowledge discovery paradigm in biomedicine and other domains. Current CrossDocRE efforts do not consider domain knowledge, which is often assumed to be known to the reader when documents are authored. Here, we propose a novel approach, KXDocRE, that embeds domain knowledge of entities with input text for cross-document RE. Our proposed framework has three main benefits over baselines: 1) it incorporates domain knowledge of entities along with the documents' text; 2) it offers interpretability by producing explanatory text for predicted relations between entities; and 3) it improves performance over prior methods.
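To make the idea of embedding entity knowledge with input text concrete, the following is a minimal illustrative sketch, not the KXDocRE implementation. It assumes entity types are available as plain strings and simply appends them after the first mention of each entity. It also shows a naive entity-based filter that keeps only sentences mentioning the head or tail entity. The function names `add_entity_type_context` and `entity_filter`, the parenthesized format, and the sample data are all hypothetical.

```python
from typing import Dict, List


def add_entity_type_context(text: str, entity_types: Dict[str, List[str]]) -> str:
    """Append each entity's types in parentheses after its first mention.

    Hypothetical stand-in for entity type context (EC); the actual
    encoding used by KXDocRE is not specified in this section.
    """
    for entity, types in entity_types.items():
        if entity in text and types:
            marker = f"{entity} ({', '.join(types)})"
            # Replace only the first occurrence of the entity mention.
            text = text.replace(entity, marker, 1)
    return text


def entity_filter(sentences: List[str], head: str, tail: str) -> List[str]:
    """Keep only sentences that mention the head or the tail entity.

    A naive substring match, assumed here purely for illustration.
    """
    return [s for s in sentences if head in s or tail in s]


# Hypothetical sample data.
sentences = [
    "Marie Curie was born in Warsaw.",
    "The weather was cold that year.",
    "Warsaw is the capital of Poland.",
]
types = {"Marie Curie": ["human", "physicist"], "Warsaw": ["city"]}

kept = entity_filter(sentences, head="Marie Curie", tail="Poland")
augmented = [add_entity_type_context(s, types) for s in kept]
```

With this sample input, the filter drops the sentence that mentions neither entity, and the remaining sentences carry type annotations inline before they would be passed to an encoder.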
