Binder: Hierarchical Concept Representation through Order Embedding of Binary Vectors
Croix Gyurek, Niloy Talukder, Mohammad Al Hasan
TL;DR
Binder introduces a binary, order-based embedding for hierarchical concepts: each concept is a $d$-bit vector in $\{0,1\}^d$, and if $a$ is-a $b$ then $\mathbf{b}_j=1$ implies $\mathbf{a}_j=1$ for all $j$. Learning is reframed as a binary constraint satisfaction problem (CSP) whose loss is built from positive and negative constraints, solved by a gradient-inspired, randomized bit-flip local search that runs in $O(ndT(|P|+|N|))$ time. Empirically, Binder achieves competitive representation accuracy across six datasets and substantially better transitive-closure link prediction than nine baselines, with a much smaller memory footprint (e.g., 2.36 MB vs. 34.2 MB or more). Because the embedding is binary, individual bits can be read as attributes and concepts can be combined with boolean operations (AND/OR); the method also scales to large hierarchies such as WordNet. Overall, Binder provides a compact, scalable, and effective alternative to continuous, box, and Hyperbolic embeddings for hierarchical concept representation, with robust transitive inference from direct edges alone.
Abstract
For natural language understanding and generation, embedding concepts using an order-based representation is an essential task. Unlike traditional point-vector representations, an order-based representation imposes geometric constraints on the representation vectors to explicitly capture the semantic relationships that may exist between a pair of concepts. Several order-based embedding approaches have been proposed in the literature, mostly focusing on capturing hierarchical relationships; examples include vectors in Euclidean space and complex, Hyperbolic, order, and box embeddings. Box embedding creates a rich, region-based representation of concepts, but in the process it sacrifices simplicity, requiring a custom-made optimization scheme for learning the representation. Hyperbolic embedding improves embedding quality by exploiting the ever-expanding property of Hyperbolic space, but it suffers the same fate as box embedding, since gradient-descent-like optimization is not simple in Hyperbolic space. In this work, we propose Binder, a novel approach for order-based representation. Binder uses binary vectors for embedding, so the embedding vectors are compact, with a footprint an order of magnitude smaller than that of other methods. Binder learns the representation vectors with a simple and efficient optimization scheme that has linear time complexity. Our comprehensive experimental results show that Binder is very accurate, yielding competitive results on the representation task. Binder stands out from its competitors, however, on the transitive closure link prediction task, as it can learn concept embeddings from the direct edges alone, whereas all existing order-based approaches rely on the indirect edges.
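
To make the order constraint concrete, here is a minimal Python sketch of the is-a test on binary vectors. It is an illustration under stated assumptions, not the authors' implementation: each embedding is stored as a Python integer used as a bit mask, and the names `is_a`, `count_violations`, and the four-bit toy embeddings are hypothetical. The violation count is only a simple stand-in for a constraint-based objective and is not claimed to match the loss used in the paper.

```python
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def is_a(a: int, b: int) -> bool:
    """Return True if every bit set in b is also set in a (a is-a b)."""
    return (a & b) == b


def count_violations(
    emb: Dict[str, int],
    positives: Iterable[Pair],
    negatives: Iterable[Pair],
) -> int:
    """Count constraints the embedding breaks (hypothetical helper).

    A positive pair (x, y) is violated when x is-a y does NOT hold;
    a negative pair (x, y) is violated when x is-a y DOES hold.
    """
    bad_pos = sum(1 for x, y in positives if not is_a(emb[x], emb[y]))
    bad_neg = sum(1 for x, y in negatives if is_a(emb[x], emb[y]))
    return bad_pos + bad_neg


# Toy 4-bit embeddings, chosen by hand for illustration only.
emb = {
    "animal": 0b0001,
    "dog": 0b0011,
    "poodle": 0b0111,
    "cat": 0b1001,
}

# Direct edges only: (child, parent).
positives = [("dog", "animal"), ("poodle", "dog"), ("cat", "animal")]
# Pairs that should not satisfy is-a.
negatives = [("animal", "dog"), ("cat", "dog"), ("dog", "cat")]

violations = count_violations(emb, positives, negatives)

# Transitive pair that never appeared as a direct edge.
poodle_is_animal = is_a(emb["poodle"], emb["animal"])

# Bits shared by dog and cat, obtained with a boolean AND.
shared_bits = emb["dog"] & emb["cat"]
```

In this toy example the pair (poodle, animal) satisfies the test even though only the direct edges (poodle, dog) and (dog, animal) were listed, because bit containment is transitive. The AND of the dog and cat vectors leaves only the bit they share, which here coincides with the vector chosen for animal.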
