Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions

Marco Valentino, Danilo S. Carvalho, André Freitas

TL;DR

This paper tackles the challenge of embedding words in a space that preserves the explicit hierarchical and relational structure found in natural language definitions. It introduces a multi-relational translational model in Hyperbolic space, supported by a Definition Semantic Roles labeller (DSRL) that automatically extracts (definiendum, relation, definiens) triples from WordNet glosses. Empirically, the Hyperbolic multi-relational embeddings outperform their Euclidean counterparts and several baselines on word similarity benchmarks, encode out-of-vocabulary (OOV) words well in a one-shot setting, and support interpretable latent space traversal and relation-aware transformations. The work highlights the efficiency and interpretability of the Hyperbolic, definition-driven approach, suggesting practical benefits for controllable semantic navigation and definitional reasoning in NLP.

Abstract

Natural language definitions possess a recursive, self-explanatory semantic structure that can support representation learning methods able to preserve explicit conceptual relations and constraints in the latent space. This paper presents a multi-relational model that explicitly leverages such a structure to derive word embeddings from definitions. By automatically extracting the relations linking defined and defining terms from dictionaries, we demonstrate how the problem of learning word embeddings can be formalised via a translational framework in Hyperbolic space and used as a proxy to capture the global semantic structure of definitions. An extensive empirical analysis demonstrates that the framework can help impose the desired structural constraints while preserving the semantic mapping required for controllable and interpretable traversal. Moreover, the experiments reveal the superiority of the Hyperbolic word embeddings over their Euclidean counterparts and demonstrate that the multi-relational approach can obtain competitive results when compared to state-of-the-art neural models, with the advantage of being intrinsically more efficient and interpretable.

Paper Structure

This paper contains 25 sections, 6 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: How can we inject the recursive, hierarchical structure of natural language definitions into word embeddings? This paper investigates Hyperbolic manifolds to learn multi-relational representations exclusively from definitions, formalising the problem via a translational framework to preserve the semantic mapping between concepts in latent space.
  • Figure 2: An overview of the multi-relational framework for learning word embeddings from definitions. The methodology consists of two main phases: (A) building a specialised semantic role labeller (DSRL) for the annotation of natural language definitions and the extraction of relations from large dictionaries; (B) formalising the learning problem as a link prediction task via a translational framework. The translational formulation acts as a proxy for minimising the distance between words that are connected in the definitions (e.g., line and set) while preserving the semantic relations for interpretable and controllable traversal of the space.
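
The translational idea in phase (B) can be illustrated with a minimal sketch. It assumes the Poincaré ball model with curvature -1 and a generic score in which the relation vector translates the definiendum by Möbius addition, and the score is the negative squared hyperbolic distance to the definiens. The function names, the toy vectors, and the relation name `supertype` are hypothetical, and the scoring function is a simplification for illustration, not necessarily the formulation used in the paper.

```python
import math

def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def mobius_add(x, y):
    """Möbius addition on the Poincaré ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom
            for a, b in zip(x, y)]

def poincare_distance(x, y):
    """Hyperbolic distance: 2 * artanh(||(-x) (+) y||)."""
    diff = mobius_add([-a for a in x], y)
    norm = math.sqrt(dot(diff, diff))
    # Clamp to stay strictly inside the unit ball for numerical safety.
    return 2 * math.atanh(min(norm, 1 - 1e-12))

def score(definiendum, relation, definiens):
    """Assumed translational score: higher means a more plausible triple."""
    translated = mobius_add(definiendum, relation)
    return -poincare_distance(translated, definiens) ** 2

# Hypothetical 2-d embeddings for the (line, supertype, set) example.
line = [0.10, 0.20]
set_ = [0.30, 0.10]
supertype = [0.15, -0.05]

triple_score = score(line, supertype, set_)
```

Under these assumptions, training would raise the score of triples extracted from definitions and lower it for corrupted triples, which pulls connected words such as line and set together while each relation keeps its own translation.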