Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets
Ahmet Alp Kindiroglu, Ozgur Kara, Ogulcan Ozdemir, Lale Akarun
TL;DR
This work tackles cross-dataset isolated sign language recognition for under-resourced languages by establishing a public cross-dataset transfer learning benchmark from two Turkish datasets (BSign22k and AUTSL) that share 57 signs. It uses a coordinate-based SL-GCN pipeline fed by OpenPose joints and evaluates five supervised transfer methods under closed-set and partial-set transfer scenarios. The study shows that specialized supervised transfer approaches (MCC, JAN, DSBN, DANN) can surpass finetuning, especially when target data is scarce or when shared labels exist, and that partial-set transfer benefits from larger source vocabularies. Overall, the paper provides a reproducible benchmark and reports gains over finetuning for cross-dataset SLR, supporting research on under-resourced sign languages and related video classification tasks.
Abstract
Sign language recognition (SLR) has recently achieved a breakthrough in performance thanks to deep neural networks trained on large annotated sign datasets. However, such annotated datasets exist for only a few of the many sign languages. Since acquiring gloss-level labels for sign language videos is difficult, transferring knowledge from existing annotated sources is useful for recognition in under-resourced sign languages. This study provides a publicly available cross-dataset transfer learning benchmark built from two existing public Turkish SLR datasets. We use a temporal graph convolution-based sign language recognition approach to evaluate five supervised transfer learning approaches, and we experiment with closed-set and partial-set cross-dataset transfer learning. Experiments demonstrate that specialized supervised transfer learning methods can improve over finetuning-based transfer learning.
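The two transfer scenarios named in the abstract differ in how the source and target label spaces relate. The sketch below illustrates that difference under the usual definitions from the domain adaptation literature: in closed-set transfer both domains use the same classes, while in partial-set transfer the target classes are a subset of a larger source vocabulary. The function name `build_label_spaces` and the gloss names are hypothetical and purely illustrative; they are not taken from the paper's code or from either dataset.

```python
from typing import Iterable, List, Tuple


def build_label_spaces(
    source_vocab: Iterable[str],
    target_vocab: Iterable[str],
    scenario: str,
) -> Tuple[List[str], List[str]]:
    """Return (source_classes, target_classes) for a transfer scenario."""
    source = set(source_vocab)
    target = set(target_vocab)
    shared = sorted(source & target)
    if not shared:
        raise ValueError("source and target share no sign classes")
    if scenario == "closed":
        # Closed-set: both domains are restricted to the shared classes.
        return shared, shared
    if scenario == "partial":
        # Partial-set: the source keeps its whole vocabulary, while the
        # target classes form a subset of it.
        return sorted(source), shared
    raise ValueError(f"unknown scenario: {scenario!r}")


# Toy gloss names (assumed for illustration, not from BSign22k or AUTSL).
source_glosses = ["HELLO", "THANKS", "DOCTOR", "WATER", "SCHOOL"]
target_glosses = ["HELLO", "WATER", "FAMILY"]

closed_src, closed_tgt = build_label_spaces(source_glosses, target_glosses, "closed")
partial_src, partial_tgt = build_label_spaces(source_glosses, target_glosses, "partial")
```

In this toy setup, the closed-set scenario trains and evaluates on the two shared glosses only, whereas the partial-set scenario lets the source model see all five source glosses before transferring to the two shared ones.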
