Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks
Debjyoti Mondal, Rahul Mishra, Chandan Pandey
TL;DR
Seg-HGNN is a lightweight, unsupervised image segmentation framework that operates in hyperbolic space to capture latent hierarchical structure with few parameters. It projects patch-level features from a frozen transformer into the Lorentz model and applies a single hyperbolic graph convolutional layer, achieving competitive localization and segmentation results on standard benchmarks with fewer than 7.5k trainable parameters and fast inference on common GPUs. The approach combines a Lorentz linear transform, Einstein midpoint aggregation, and a relaxed normalized-cut loss to cluster patches without supervision. The results suggest that hyperbolic representations can support compact, scalable, and effective image analysis, with practical implications for edge-friendly vision systems and real-time applications.
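Two of the ingredients named above, the projection of patch features into the Lorentz model and a relaxed normalized-cut objective, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes unit negative curvature, uses the exponential map at the hyperboloid origin as the projection, and uses one common relaxation of the normalized cut (a trace ratio plus an orthogonality penalty on the soft assignments). The function names `lorentz_inner`, `expmap0`, and `relaxed_ncut_loss` are hypothetical, and the exact formulation used by Seg-HGNN may differ.

```python
import numpy as np


def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_i xi*yi over the last axis."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)


def expmap0(u, eps=1e-9):
    """Map Euclidean features u of shape (n, d) onto the unit hyperboloid, shape (n, d+1).

    Assumption: curvature -1 and projection via the exponential map at the
    origin o = (1, 0, ..., 0); u is treated as a tangent vector at o.
    """
    norm = np.maximum(np.linalg.norm(u, axis=-1, keepdims=True), eps)
    time = np.cosh(norm)                # time-like coordinate x0
    space = np.sinh(norm) * u / norm    # space-like coordinates x1..xd
    return np.concatenate([time, space], axis=-1)


def relaxed_ncut_loss(A, S):
    """Relaxed normalized-cut loss for soft cluster assignments.

    A: (n, n) symmetric, non-negative patch affinity matrix.
    S: (n, k) soft assignments, each row summing to 1.
    Assumption: trace-ratio relaxation with an orthogonality penalty;
    the paper's exact loss is not given in this section.
    """
    D = np.diag(A.sum(axis=1))
    cut_term = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)
    StS = S.T @ S
    k = S.shape[1]
    # Frobenius-norm penalty pushing clusters toward balanced, orthogonal assignments.
    ortho_term = np.linalg.norm(StS / np.linalg.norm(StS) - np.eye(k) / np.sqrt(k))
    return cut_term + ortho_term
```

Every row returned by `expmap0` satisfies the hyperboloid constraint `lorentz_inner(x, x) = -1` up to floating-point error. In `relaxed_ncut_loss`, an assignment that follows the graph's connected components scores lower than one that cuts across them.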
Abstract
Image analysis in Euclidean space through linear hyperspaces is well studied. However, in the quest for more effective image representations, we turn to hyperbolic manifolds, which provide a compelling alternative for capturing complex hierarchical relationships in images with remarkably small dimensionality. To demonstrate the competence of hyperbolic embeddings, we introduce a light-weight hyperbolic graph neural network for image segmentation that encodes patch-level features in a very small embedding size. Our solution, Seg-HGNN, surpasses the current best unsupervised method by 2.5\% and 4\% on VOC-07 and VOC-12 for localization, and by 0.8\% and 1.3\% on CUB-200 and ECSSD for segmentation, respectively. With fewer than 7.5k trainable parameters, Seg-HGNN delivers effective and fast ($\approx 2$ images/second) results on standard GPUs such as the GTX1650. This empirical evaluation presents compelling evidence of the efficacy and potential of hyperbolic representations for vision tasks.
