BdSL-SPOTER: A Transformer-Based Framework for Bengali Sign Language Recognition with Cultural Adaptation

Sayad Ibna Azad, Md. Atiqur Rahman

TL;DR

BdSL-SPOTER tackles Bengali Sign Language recognition under data scarcity by introducing a culturally adapted pose-based transformer with four encoder layers and learnable positional encodings. It combines BdSL-specific pose normalization, curriculum learning, and an efficient architecture to achieve $97.92\%$ Top-1 accuracy on BdSLW60, a $22.82\%$ improvement over the Bi-LSTM baseline, while maintaining a small footprint (~$3.2\text{ MB}$) and real-time throughput ($127\text{ FPS}$ on A100). The approach reduces training time by $63.1\%$ and parameters by $29.4\%$ relative to SPOTER, enabled by 4-layer encoders and 9-head attention, and validated via 5-fold cross-validation with strong statistical effects ($p<0.001$, Cohen's $d=2.84$). This work highlights the importance of cultural adaptation and data-efficient transformer designs for low-resource regional sign languages and offers a scalable blueprint for extending to other languages.
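For readers unfamiliar with the effect-size figure quoted above, the following is a minimal sketch of how Cohen's $d$ is commonly computed from two sets of per-fold accuracies, using the pooled sample standard deviation. The function name `cohens_d` and the fold accuracies are hypothetical placeholders for illustration only; they are not values from the paper, and the source does not state which variant of the statistic the authors used.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample variance."""
    na, nb = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical per-fold accuracies (5 folds each); NOT results from the paper.
fold_acc_proposed = [0.90, 0.92, 0.91, 0.93, 0.89]
fold_acc_baseline = [0.85, 0.87, 0.86, 0.88, 0.84]

d = cohens_d(fold_acc_proposed, fold_acc_baseline)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, values of $d$ above roughly 0.8 are described as a large effect, so a reported $d=2.84$ indicates a gap that is large relative to the fold-to-fold spread.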

Abstract

We introduce BdSL-SPOTER, a pose-based transformer framework for accurate and efficient recognition of Bengali Sign Language (BdSL). BdSL-SPOTER extends the SPOTER paradigm with culture-specific preprocessing and a compact four-layer transformer encoder featuring optimized learnable positional encodings, and it employs curriculum learning to enhance generalization on limited data and accelerate convergence. On the BdSLW60 benchmark, it achieves 97.92% Top-1 validation accuracy, a 22.82% improvement over the Bi-LSTM baseline, while keeping computational costs low. With fewer parameters, lower FLOPs, and higher FPS, BdSL-SPOTER provides a practical framework for real-world accessibility applications and serves as a scalable model for other low-resource regional sign languages.

Paper Structure

This paper contains 9 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: BdSL-SPOTER: (1) Preprocessing Pipeline, (2) 4-layer Transformer Encoders, (3) Classification Head.
  • Figure 2: Confusion matrix illustrating per-class accuracy.
  • Figure 3: Training dynamics of BdSL-SPOTER.