Advanced Arabic Alphabet Sign Language Recognition Using Transfer Learning and Transformer Models
Mazen Balat, Rewaa Awaad, Hend Adel, Ahmed B. Zaky, Salah A. Aly
TL;DR
This work tackles Arabic Alphabet Sign Language recognition by applying transfer learning with both CNN and transformer architectures to two public datasets, ArSL2018 and AASL. The authors fine-tune five pre-trained backbones (ResNet50, MobileNetV2, EfficientNetB7, ViT, and Swin) within a single framework, adapting each to the 28 Arabic alphabet classes. They report accuracies of up to 99.6% on ArSL2018 and 99.43% on AASL, which they state surpass previously reported results, with transformer models giving the best accuracy at a higher training cost. The results point to the potential of transformer-based recognition for Arabic sign language and to future work on real-time deployment and on robust, multilingual assistive tools.
Abstract
This paper presents an approach to Arabic Alphabet Sign Language recognition that combines deep learning with transfer learning and transformer-based models. We study the performance of several model variants on two publicly available datasets, ArSL2018 and AASL. We use state-of-the-art CNN architectures (ResNet50, MobileNetV2, and EfficientNetB7) together with recent transformer models (Google ViT and Microsoft Swin Transformer). These pre-trained models are fine-tuned on both datasets to capture the distinctive features of Arabic sign language gestures. Experimental results show that the proposed methodology achieves high recognition accuracy, up to 99.6% on ArSL2018 and 99.43% on AASL. These results surpass previously reported state-of-the-art approaches. This performance opens further avenues for accessible communication for Arabic-speaking deaf and hard-of-hearing people and thus supports a more inclusive society.
