Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions
Marco Valentino, Danilo S. Carvalho, André Freitas
TL;DR
This paper tackles the challenge of embedding words in a space that preserves the explicit hierarchical and relational structure found in natural language definitions. It introduces a multi-relational translational model in Hyperbolic space, built on (definiendum, relation, definiens) triples that a Definition Semantic Roles labeller automatically extracts from WordNet glosses. Empirically, the Hyperbolic multi-relational embeddings outperform their Euclidean counterparts and several baselines on word similarity benchmarks, show strong one-shot encoding of out-of-vocabulary (OOV) words, and support interpretable latent space traversals and relation-aware transformations. The work highlights the efficiency and interpretability of the Hyperbolic, definition-driven approach, suggesting practical benefits for controllable semantic navigation and definitional reasoning in NLP.
Abstract
Natural language definitions possess a recursive, self-explanatory semantic structure that can support representation learning methods able to preserve explicit conceptual relations and constraints in the latent space. This paper presents a multi-relational model that explicitly leverages such a structure to derive word embeddings from definitions. By automatically extracting the relations linking defined and defining terms from dictionaries, we demonstrate how the problem of learning word embeddings can be formalised via a translational framework in Hyperbolic space and used as a proxy to capture the global semantic structure of definitions. An extensive empirical analysis demonstrates that the framework can help impose the desired structural constraints while preserving the semantic mapping required for controllable and interpretable traversal. Moreover, the experiments reveal the superiority of the Hyperbolic word embeddings over the Euclidean counterparts and demonstrate that the multi-relational approach can obtain competitive results when compared to state-of-the-art neural models, with the advantage of being intrinsically more efficient and interpretable.
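To make the idea of a translational framework in Hyperbolic space more concrete, the following is a minimal sketch of how a (definiendum, relation, definiens) triple might be scored on the Poincaré ball. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring function, so the choice of Möbius addition as the translation, unit negative curvature, the function names, and the toy vectors are all hypothetical.

```python
import math


def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y):
    """Mobius addition on the Poincare ball (curvature -1).

    This is the hyperbolic analogue of vector translation x + y.
    """
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom
            for a, b in zip(x, y)]


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit ball."""
    diff = [a - b for a, b in zip(x, y)]
    arg = 1 + 2 * dot(diff, diff) / ((1 - dot(x, x)) * (1 - dot(y, y)))
    return math.acosh(arg)


def score(definiendum, relation, definiens):
    """Assumed translational score: higher (closer to 0) means a better fit.

    The definiendum is translated by the relation vector and compared
    with the definiens via the squared hyperbolic distance.
    """
    translated = mobius_add(definiendum, relation)
    return -poincare_distance(translated, definiens) ** 2


# Hypothetical 2-d toy embeddings, all with norm < 1.
word = [0.10, 0.20]        # definiendum
relation = [0.30, -0.10]   # e.g. a "supertype" role
good = mobius_add(word, relation)  # definiens exactly at the translated point
bad = [-0.50, 0.40]                # definiens far from the translated point

print(score(word, relation, good) > score(word, relation, bad))
```

In this sketch, a triple whose definiens lies exactly at the translated definiendum receives the maximum score of zero, and the score decreases as the hyperbolic distance grows. A trained model would learn the word and relation vectors so that triples extracted from definitions score higher than corrupted ones.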
