Compositional Vector Space Models for Knowledge Base Completion

Arvind Neelakantan, Benjamin Roth, Andrew McCallum

TL;DR

The paper tackles knowledge base completion by learning compositional vector representations of multi-hop relation paths with a recursive neural network (RNN). Each path vector is composed from the embeddings of the relations along the path and is matched against the embedding of a target relation to predict missing facts. Because paths are composed rather than treated as atomic features, the model generalizes to paths unseen at training time; with a single general composition matrix and fixed relation vectors, it can also predict relation types unseen during training (zero-shot learning). On a new dataset of over 52M relational triples, the approach outperforms a traditional path-based classifier by up to 11% and a baseline using pre-trained embeddings by up to 7%, with further gains when combined with a bigram-based baseline.

Abstract

Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop relational synonyms like this, or use as evidence a multi-hop relational path treated as an atomic feature, like bornIn(X,Z) -> containedIn(Z,Y). This paper presents an approach that reasons about conjunctions of multi-hop relations non-atomically, composing the implications of a path using a recursive neural network (RNN) that takes as inputs vector embeddings of the binary relation in the path. Not only does this allow us to generalize to paths unseen at training time, but also, with a single high-capacity RNN, to predict new relation types not seen when the compositional model was trained (zero-shot learning). We assemble a new dataset of over 52M relational triples, and show that our method improves over a traditional classifier by 11%, and a method leveraging pre-trained embeddings by 7%.
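
To make the composition idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It folds the relation embeddings of a path such as bornIn(X,Z) -> containedIn(Z,Y) into a single path vector with one shared composition matrix, then compares that vector with the embedding of a target relation such as nationality. The tanh nonlinearity, the sigmoid-of-dot-product score, the embedding size, and the random untrained parameters are all assumptions chosen for illustration, and the names `compose_path` and `score` are hypothetical.

```python
import numpy as np


def compose_path(relation_vectors, W, activation=np.tanh):
    """Fold a sequence of relation embeddings into one path vector.

    Assumption: the running path vector is concatenated with the next
    relation embedding and passed through a shared matrix W of shape
    (d, 2d), followed by a nonlinearity. A path of length one is
    represented by its relation embedding unchanged.
    """
    h = relation_vectors[0]
    for v in relation_vectors[1:]:
        h = activation(W @ np.concatenate([h, v]))
    return h


def score(path_vector, target_vector):
    """Assumed similarity: sigmoid of the dot product, a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(path_vector @ target_vector)))


# Hypothetical toy setup: random, untrained parameters.
rng = np.random.default_rng(0)
d = 4
embeddings = {
    name: rng.normal(size=d)
    for name in ("bornIn", "containedIn", "nationality")
}
W = rng.normal(scale=0.1, size=(d, 2 * d))

# Path bornIn -> containedIn, scored against the target relation nationality.
path = [embeddings["bornIn"], embeddings["containedIn"]]
path_vec = compose_path(path, W)
p = score(path_vec, embeddings["nationality"])
print(path_vec.shape, float(p))
```

Because the parameters here are random, the printed score carries no meaning; in a trained model the embeddings and the composition matrix would be learned so that paths implying the target relation score highly. Since the composition works on any sequence of relation embeddings, the same function applies to a path that never occurred as a whole during training, which is the generalization property the abstract describes.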

Paper Structure

This paper contains 15 sections, 6 equations, 2 figures, 4 tables, 1 algorithm.

Figures (2)

  • Figure 1: Semantically similar paths connecting entity pair (Microsoft, USA).
  • Figure 2: Vector Representations of the paths are computed by applying the composition function recursively.