The Numerical Stability of Hyperbolic Representation Learning
Gal Mishne, Zhengchao Wan, Yusu Wang, Sheng Yang
TL;DR
This work analyzes the numerical stability of hyperbolic representation learning, comparing the Poincaré ball and Lorentz model and identifying optimization-driven advantages for the Lorentz model despite its smaller representational capacity. It introduces a Euclidean parametrization of hyperbolic space that preserves full capacity while yielding optimization dynamics similar to the Lorentz model, and extends this approach to hyperplanes and a new hyperbolic SVM formulation (LSVMPP). The authors provide theoretical results on gradient behavior, radius-based representation limits, and isometric transitions between models, complemented by empirical evidence from tree embeddings and multiclass SVM tasks. The proposed Euclidean parametrization improves robustness and performance, offering practical guidelines for stable hyperbolic learning and scalable hierarchical representation.
Abstract
Given the exponential growth of the volume of the ball w.r.t. its radius, the hyperbolic space is capable of embedding trees with arbitrarily small distortion and hence has received wide attention for representing hierarchical datasets. However, this exponential growth property comes at the price of numerical instability: training hyperbolic learning models sometimes leads to catastrophic NaN problems, where values are encountered that cannot be represented in floating-point arithmetic. In this work, we carefully analyze the limitations of two popular models for the hyperbolic space, namely, the Poincaré ball and the Lorentz model. We first show that, under the 64-bit arithmetic system, the Poincaré ball has a relatively larger capacity than the Lorentz model for correctly representing points. Then, we theoretically validate the superiority of the Lorentz model over the Poincaré ball from the perspective of optimization. Given the numerical limitations of both models, we identify one Euclidean parametrization of the hyperbolic space which can alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and demonstrate its ability to improve the performance of hyperbolic SVM.
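The representation limits mentioned in the abstract can be illustrated with a minimal sketch in standard 64-bit floats. It is not the authors' code. The helper names `poincare_norm` and `lorentz_constraint` are hypothetical, and the sketch assumes curvature -1 and a one-dimensional slice of each model. A point at hyperbolic distance r from the origin has Euclidean norm tanh(r/2) in the Poincaré ball and coordinates (cosh r, sinh r) in the Lorentz model, which must satisfy -x0^2 + x1^2 = -1.

```python
import math


def poincare_norm(r: float) -> float:
    """Euclidean norm of a Poincare-ball point at hyperbolic distance r
    from the origin (curvature -1 assumed)."""
    return math.tanh(r / 2.0)


def lorentz_constraint(r: float) -> float:
    """Value of -x0^2 + x1^2 for the Lorentz point (cosh r, sinh r).
    In exact arithmetic this is always -1."""
    x0 = math.cosh(r)
    x1 = math.sinh(r)
    return -x0 * x0 + x1 * x1


if __name__ == "__main__":
    for r in (5.0, 30.0, 50.0):
        norm = poincare_norm(r)
        # A norm of exactly 1.0 means the point has been rounded onto the
        # boundary of the ball, where the hyperbolic metric is undefined.
        on_boundary = norm == 1.0
        # A constraint value far from -1 means the stored coordinates no
        # longer describe a point on the hyperboloid.
        drift = abs(lorentz_constraint(r) + 1.0)
        print(f"r={r:5.1f}  ball norm rounds to 1: {on_boundary}  "
              f"Lorentz constraint error: {drift:.3e}")
```

Since 1 - tanh(r/2) is roughly 2e^{-r}, the ball norm stays distinguishable from 1 at r = 30 but not at r = 50. The Lorentz constraint is already lost at r = 30, because cosh(r)^2 exceeds the 53-bit precision of a double. This agrees qualitatively with the claim that the Poincaré ball has the larger capacity for correctly representing points.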
