New keypoint-based approach for recognising British Sign Language (BSL) from sequences
Oishi Deb, KR Prajwal, Andrew Zisserman
TL;DR
This paper addresses the recognition of British Sign Language (BSL) words in continuous signing using a keypoint-based approach. It introduces a Transformer model that processes sequences of 2D keypoints extracted from the face, hands, and body pose with MediaPipe, at substantially lower computational cost than RGB-based methods, while reaching 60% top-5 accuracy on unseen data. The work indicates that keypoint representations can enable real-time, signer-independent BSL recognition, and it lays the groundwork for multimodal extensions and 3D pose estimation. The findings point to practical potential for efficient, accessible sign-language recognition systems, with room for accuracy gains through modality fusion and more advanced pose estimation.
Abstract
In this paper, we present a novel keypoint-based classification model designed to recognise British Sign Language (BSL) words within continuous signing sequences. We assess the model on the BOBSL dataset and find that the keypoint-based approach surpasses its RGB-based counterpart in computational efficiency and memory usage. It also trains faster and requires fewer computational resources. To the best of our knowledge, this is the first application of a keypoint-based model to BSL word classification, so direct comparisons with existing work are not available.
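To give a rough sense of why a keypoint input can be cheaper than an RGB one, the sketch below counts the scalar values per frame under each representation. It is an illustration only, not the authors' pipeline: the landmark counts are those of MediaPipe's Holistic solution (33 pose, 21 per hand, 468 face), and whether the paper uses all of them is an assumption. The 224x224 RGB crop size and the helper names are likewise hypothetical.

```python
# Landmark counts from MediaPipe's Holistic solution (pose, two hands, face mesh).
# Using all of them is an assumption made for illustration, not a detail from the paper.
LANDMARKS = {"pose": 33, "left_hand": 21, "right_hand": 21, "face": 468}


def keypoint_values_per_frame(landmarks=LANDMARKS, dims=2):
    """Scalar inputs per frame when each landmark is a 2D (x, y) point."""
    return sum(landmarks.values()) * dims


def rgb_values_per_frame(height=224, width=224, channels=3):
    """Scalar inputs per frame for an RGB crop (224x224 is a hypothetical size)."""
    return height * width * channels


def flatten_frame(frame):
    """Flatten one frame's {part: [(x, y), ...]} mapping into a single feature vector."""
    return [coord for part in LANDMARKS for point in frame[part] for coord in point]


kp = keypoint_values_per_frame()
rgb = rgb_values_per_frame()
print(kp, rgb, round(rgb / kp, 1))  # 1086 150528 138.6
```

Under these assumptions a frame shrinks from about 150,000 values to about 1,100. This is a ratio of input sizes only; the actual savings in compute and memory depend on the model architectures being compared.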
