A Continual Relation Extraction Approach for Knowledge Graph Completeness
Sefika Efeoglu
TL;DR
The work addresses the challenge of maintaining complete and up-to-date knowledge graphs from streaming, non-stationary corona-news text. It proposes a weakly supervised online continual relation extraction framework, featuring a named-entity tagging component and a snowball-style RE learner augmented by KG and category embeddings and dependency parsing. The approach targets semantic drift, knowledge retention, and limited annotation to continuously discover and integrate new relation types, thereby enhancing KG completeness in real-world data streams. If successful, it enables more accurate, explainable relation extraction in health-domain news and supports continual updating of knowledge graphs for information management tasks.
Abstract
Representing unstructured data in a structured form is essential for information system management, because it allows the data to be analyzed and interpreted. To this end, unstructured data can be converted into Knowledge Graphs by means of an information extraction pipeline whose main tasks are named entity recognition and relation extraction. This thesis aims to develop a novel continual relation extraction method that identifies relations (interconnections) between entities in a real-world data stream. The domain-specific data used in this thesis is corona news from German and Austrian newspapers.
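To make the pipeline described in the abstract concrete, the following is a minimal sketch of how named entity recognition and relation extraction can feed a knowledge graph of (head, relation, tail) triples from a stream of sentences. It is not the method proposed in the thesis: the gazetteer `ENTITY_TYPES`, the rule table `RELATION_PATTERNS`, the entity types, the relation names, and the example sentences are all hypothetical, and the static rules stand in for the learned, continually updated components that the thesis targets.

```python
import re
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Hypothetical gazetteer standing in for a trained named-entity tagger.
ENTITY_TYPES = {
    "Austria": "LOCATION",
    "Vienna": "LOCATION",
    "Pfizer": "ORGANIZATION",
    "Comirnaty": "VACCINE",
}

# Hypothetical rules: (head type, trigger regex, tail type, relation name).
RELATION_PATTERNS = [
    ("ORGANIZATION", re.compile(r"\b(produces|manufactures)\b"), "VACCINE", "produces"),
    ("LOCATION", re.compile(r"\bis the capital of\b"), "LOCATION", "capital_of"),
]


def tag_entities(sentence):
    """Return (start, end, surface, type) for each gazetteer match, ordered by position."""
    spans = []
    for surface, etype in ENTITY_TYPES.items():
        for m in re.finditer(r"\b" + re.escape(surface) + r"\b", sentence):
            spans.append((m.start(), m.end(), surface, etype))
    return sorted(spans)


def extract_triples(sentence):
    """Apply the rules to the text between each pair of adjacent entities."""
    entities = tag_entities(sentence)
    triples = []
    for (_, e1, head, htype), (s2, _, tail, ttype) in zip(entities, entities[1:]):
        between = sentence[e1:s2]
        for head_type, trigger, tail_type, relation in RELATION_PATTERNS:
            if htype == head_type and ttype == tail_type and trigger.search(between):
                triples.append(Triple(head, relation, tail))
    return triples


def build_graph(stream):
    """Consume sentences one at a time and accumulate triples in a set."""
    graph = set()
    for sentence in stream:
        graph.update(extract_triples(sentence))
    return graph


if __name__ == "__main__":
    news_stream = [
        "Pfizer produces Comirnaty.",
        "Vienna is the capital of Austria.",
    ]
    for triple in sorted(build_graph(news_stream)):
        print(triple)
```

Because the rules in this sketch are fixed, it cannot discover new relation types or adapt when the text distribution shifts; those are the gaps a continual relation extraction method is meant to close.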
