Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
Edward Fish, Richard Bowden
TL;DR
Geo-Sign tackles Sign Language Translation (SLT) by embedding skeletal motion in hyperbolic space to respect the hierarchical structure of sign kinematics. It projects ST-GCN skeletal features into the Poincaré ball with a learnable curvature $c$ and regularises a pre-trained mT5 translator via a geometry-aware contrastive loss, explored through global and token-based alignment strategies. The approach yields significant gains on CSL-Daily (e.g., BLEU-4 and ROUGE-L improvements over pose baselines) and is competitive with RGB methods, while preserving signer privacy and inference-time efficiency. The work demonstrates that hyperbolic geometry provides a principled inductive bias for discriminating fine-grained hand articulations from broader body movements, with potential extensions to multiple sign languages and other skeleton-based tasks.
Abstract
Recent progress in Sign Language Translation (SLT) has focussed primarily on improving the representational capacity of large language models to incorporate sign language features. This work explores an alternative direction: enhancing the geometric properties of the skeletal representations themselves. We propose Geo-Sign, a method that leverages the properties of hyperbolic geometry to model the hierarchical structure inherent in sign language kinematics. By projecting skeletal features derived from Spatio-Temporal Graph Convolutional Networks (ST-GCNs) into the Poincaré ball model, we aim to create more discriminative embeddings, particularly for fine-grained motions such as finger articulations. We introduce a hyperbolic projection layer, a weighted Fréchet mean aggregation scheme, and a geometric contrastive loss operating directly in hyperbolic space. These components are integrated into an end-to-end translation framework as a regularisation function that enhances the representations within the language model. This work demonstrates the potential of hyperbolic geometry to improve skeletal representations for Sign Language Translation, surpassing SOTA RGB methods while preserving privacy and increasing computational efficiency. Code is available at https://github.com/ed-fish/geo-sign.
