Inducing Relational Knowledge from BERT
Zied Bouraoui, Jose Camacho-Collados, Steven Schockaert
TL;DR
The paper investigates whether BERT encodes relational knowledge beyond what standard word embeddings capture, and introduces a three-stage pipeline to distill this knowledge: mine sentences from a large corpus that are likely to express a relation, filter them down to a set of templates, and fine-tune BERT to classify new word pairs for that relation using instantiated templates. Experiments on semantic, encyclopedic, and commonsense relations show strong gains over word-vector baselines for certain relations, while morphological relations remain difficult. A key finding is that performance depends substantially on the quality of the automatically discovered templates. Overall, the work demonstrates an automated approach to extracting relational knowledge from BERT without manually written templates, highlighting its potential for enriching knowledge bases with commonsense and factual relations.
Abstract
One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.
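To make the first two stages concrete, here is a minimal sketch of mining templates from seed pairs and instantiating them for a new word pair. It is an illustration under stated assumptions, not the authors' implementation: the placeholder tokens `<H>` and `<T>`, the function names, the toy corpus, and the frequency-based filter are all hypothetical stand-ins, and the fine-tuning stage is omitted entirely.

```python
import re
from collections import Counter

# Hypothetical placeholder tokens for the two slots of a relation.
HEAD, TAIL = "<H>", "<T>"


def mine_templates(corpus, seed_pairs, min_count=1):
    """Turn sentences mentioning both words of a seed pair into templates.

    Assumption: a template is kept if at least `min_count` seed sentences
    reduce to it. This simple frequency filter only stands in for the
    template-selection step described in the abstract.
    """
    counts = Counter()
    for sentence in corpus:
        for head, tail in seed_pairs:
            head_pat = re.compile(rf"\b{re.escape(head)}\b", re.IGNORECASE)
            tail_pat = re.compile(rf"\b{re.escape(tail)}\b", re.IGNORECASE)
            if head_pat.search(sentence) and tail_pat.search(sentence):
                template = head_pat.sub(HEAD, sentence)
                template = tail_pat.sub(TAIL, template)
                counts[template] += 1
    return [t for t, c in counts.most_common() if c >= min_count]


def instantiate(template, head, tail):
    """Fill a template with a candidate word pair."""
    return template.replace(HEAD, head).replace(TAIL, tail)


# Toy corpus and seed instances of a "capital of" relation (illustrative only).
corpus = [
    "Paris is the capital of France.",
    "Tokyo is the capital of Japan.",
    "I visited Paris last year.",
]
seed_pairs = [("Paris", "France"), ("Tokyo", "Japan")]

templates = mine_templates(corpus, seed_pairs, min_count=2)
candidate_input = instantiate(templates[0], "Rome", "Italy")
```

In the pipeline described above, a string such as `candidate_input` would be the input to the fine-tuned language model, which predicts whether the pair is an instance of the relation.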
