Lorentzian Residual Neural Networks
Neil He, Menglin Yang, Rex Ying
TL;DR
This work introduces LResNet, a Lorentzian residual network that forms residual connections directly on the Lorentz hyperboloid using a weighted Lorentzian centroid. Because it avoids mappings to and from tangent spaces and parallel transport, LResNet is commutative and improves numerical stability and computational efficiency, while preserving hyperbolic structure and allowing prior hyperbolic residual methods to be derived from it. The approach is evaluated on graph neural networks, graph Transformers, and vision models, where it consistently improves on Euclidean and existing hyperbolic residual methods and reduces computation time. These results suggest that LResNet can support more expressive and robust hyperbolic architectures across domains, including CNNs, GNNs, and graph Transformers.
Abstract
Hyperbolic neural networks have emerged as a powerful tool for modeling the hierarchical data structures prevalent in real-world datasets. Residual connections, which let information flow directly across layers, have been instrumental in the success of deep neural networks. However, current methods for constructing hyperbolic residual networks suffer from increased model complexity, numerical instability, and errors caused by repeated mappings to and from the tangent space. To address these limitations, we introduce LResNet, a novel Lorentzian residual neural network based on the weighted Lorentzian centroid in the Lorentz model of hyperbolic geometry. Our method enables the efficient integration of residual connections in Lorentz hyperbolic neural networks while preserving their hierarchical representation capabilities. We show that previous methods can be theoretically derived from ours, and that our method offers improved stability, efficiency, and effectiveness. Extensive experiments on both graph and vision tasks demonstrate the superior performance and robustness of our method compared to state-of-the-art Euclidean and hyperbolic alternatives. Our findings highlight the potential of LResNet for building more expressive neural networks in hyperbolic embedding space, as a method generally applicable to multiple architectures, including CNNs, GNNs, and graph Transformers.
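
To make the idea of a residual connection built from a weighted Lorentzian centroid concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used form of the centroid, in which the weighted sum w_x * x + w_fx * f(x) is rescaled by 1 / (sqrt(-k) * | ||z||_L |) so that the result lies on the hyperboloid <x, x>_L = 1/k with curvature k < 0. The function names, the weights w_x and w_fx, and the example points are all hypothetical and chosen for illustration only.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift_to_hyperboloid(space, k=-1.0):
    """Build a point with <x, x>_L = 1/k (k < 0) from its space-like part."""
    time = math.sqrt(sum(s * s for s in space) - 1.0 / k)
    return [time] + list(space)

def lorentz_residual(x, fx, w_x=1.0, w_fx=1.0, k=-1.0):
    """Assumed weighted-centroid residual: weighted sum, rescaled onto the hyperboloid."""
    # Weighted sum in the ambient space; no tangent-space mapping is used.
    z = [w_x * a + w_fx * b for a, b in zip(x, fx)]
    # Magnitude of the Lorentzian norm of z (z is time-like for positive weights).
    norm = math.sqrt(abs(lorentz_inner(z, z)))
    scale = 1.0 / (math.sqrt(-k) * norm)
    return [scale * c for c in z]

# Hypothetical inputs: x plays the role of a layer input, fx of the layer output.
x = lift_to_hyperboloid([0.3, -0.2])
fx = lift_to_hyperboloid([1.5, 0.7])
out = lorentz_residual(x, fx, w_x=1.0, w_fx=0.5)

# The result should satisfy the hyperboloid constraint <out, out>_L = 1/k.
print(lorentz_inner(out, out))
```

Under these assumptions the operation involves only a weighted sum and a rescaling, which is why swapping the two inputs together with their weights leaves the result unchanged, and why no exponential or logarithmic maps are needed.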
