Revolutionizing Communication with Deep Learning and XAI for Enhanced Arabic Sign Language Recognition

Mazen Balat; Rewaa Awaad; Ahmed B. Zaky; Salah A. Aly

TL;DR

This paper tackles Arabic Sign Language recognition by pairing three deep learning architectures (MobileNetV3, ResNet50, EfficientNet-B2) with Explainable AI, using Grad-CAM to make predictions interpretable. It evaluates on two datasets, ArSL2018 and AASL. Class imbalance is addressed by undersampling ArSL2018 and by extensive data augmentation, and stratified 5-fold cross-validation is used to check that results generalize across splits. EfficientNet-B2 performs best, reaching $99.48\%$ accuracy on ArSL2018 and $98.99\%$ on AASL, while Grad-CAM visualizations show which image regions drive each decision. The results combine high accuracy with interpretability, with potential impact on healthcare, education, and inclusive communication, and they point toward broader, multilingual sign-language recognition systems.

Abstract

This study introduces an integrated approach to recognizing Arabic Sign Language (ArSL) using state-of-the-art deep learning models such as MobileNetV3, ResNet50, and EfficientNet-B2. These models are further enhanced by explainable AI (XAI) techniques to boost interpretability. The ArSL2018 and RGB Arabic Alphabets Sign Language (AASL) datasets are employed, with EfficientNet-B2 achieving peak accuracies of 99.48\% and 98.99\%, respectively. Key innovations include sophisticated data augmentation methods to mitigate class imbalance, implementation of stratified 5-fold cross-validation for better generalization, and the use of Grad-CAM for clear model decision transparency. The proposed system not only sets new benchmarks in recognition accuracy but also emphasizes interpretability, making it suitable for applications in healthcare, education, and inclusive communication technologies.
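As an illustration of the undersampling and stratified 5-fold cross-validation mentioned in the summary, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the function names `undersample` and `stratified_kfold`, the toy label list, and the choice to cut every class down to the size of the rarest class are all assumptions made for this example.

```python
import random
from collections import defaultdict


def undersample(labels, seed=0):
    """Return sorted indices keeping an equal number of samples per class.

    Assumption for this sketch: every class is cut down to the size
    of the rarest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    target = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    return sorted(kept)


def stratified_kfold(labels, indices, k=5, seed=0):
    """Yield (train, val) index lists whose class mix mirrors the full set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's samples round-robin so folds stay balanced.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, val


# Toy imbalanced label list standing in for a sign-language dataset.
labels = ["alef"] * 50 + ["baa"] * 30 + ["taa"] * 20
balanced = undersample(labels)
splits = list(stratified_kfold(labels, balanced, k=5))
```

Each of the five folds serves once as the validation set while the other four form the training set, so every retained sample is validated exactly once. In practice a library routine such as scikit-learn's `StratifiedKFold` would usually replace the hand-written splitter.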

Paper Structure (36 sections, 7 equations, 25 figures, 16 tables)

Introduction
Related Works
Datasets
Arabic Alphabets Sign Language Dataset (ArASL2018)
RGB Arabic Alphabets Sign Language Dataset (AASL)
Methodology
Data Preparation
Handling Class Imbalance: ArSL2018 and AASL Datasets
Preprocessing
Data Splitting
Stratified Splitting
Model Training and Description
Model Selection
MobileNetV3
ResNet50
...and 21 more sections

Figures (25)

Figure 1: Work Flow Diagram
Figure 2: Examples of images from the ArSL2018 dataset
Figure 3: Class distribution of the ArSL2018 dataset
Figure 4: Examples of images from the AASL dataset
Figure 5: Class distribution of the AASL dataset
...and 20 more figures
