An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface

Kevin Jose Thomas

TL;DR

This work addresses the need for accessible ASL translation tools with an open-source interface that handles fingerspell recognition and semantic pose retrieval as separate modules. The recognition module uses Google MediaPipe landmarks with two classifiers (a lightweight 2D CNN and a 3D PointNet) to convert fingerspelling into spoken English, followed by BERT-based syntactic correction. The production module maps spoken English to ASL gloss via an LLM, retrieves sign poses semantically from a pose database of pgvector embeddings, and stitches the retrieved pose sequences together for fluent signing. The system operates in real time and is robust to varying backgrounds, lighting, skin tones, and hand sizes, offering a practical stepping stone toward full ASL translation and enabling developers to build accessible, sign-language-enabled applications.

Abstract

This paper introduces an open-source interface for American Sign Language fingerspell recognition and semantic pose retrieval, intended to serve as a stepping stone towards more advanced sign language translation systems. Using a combination of convolutional neural networks and pose estimation models, the interface provides two modular components: a recognition module that translates ASL fingerspelling into spoken English and a production module that converts spoken English into ASL pose sequences. The system is designed to be highly accessible, user-friendly, and capable of functioning in real time under varying conditions such as backgrounds, lighting, skin tones, and hand sizes. We discuss the technical details of the model architecture, its application in the wild, and potential future enhancements for real-world consumer applications.
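
To make the idea of semantic pose retrieval concrete, the following is a minimal sketch, not the implementation described in the paper. It assumes each ASL gloss is stored alongside an embedding vector and that a query is answered by picking the gloss whose embedding has the highest cosine similarity to the query embedding. The toy three-dimensional vectors, the `POSE_INDEX` dictionary, and the function names are hypothetical; a real system would obtain embeddings from a text-embedding model and would likely delegate the nearest-neighbour search to the database (pgvector, for instance, provides a `<=>` cosine-distance operator).

```python
import math

# Hypothetical toy index: gloss -> embedding vector.
# In a real system these vectors would come from an embedding model
# and live in a database column rather than an in-memory dict.
POSE_INDEX = {
    "HELLO": [0.9, 0.1, 0.0],
    "THANK-YOU": [0.1, 0.9, 0.1],
    "HOUSE": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_pose(query_vec, index=POSE_INDEX):
    """Return the gloss whose embedding is most similar to query_vec."""
    return max(index, key=lambda gloss: cosine_similarity(query_vec, index[gloss]))


def retrieve_sequence(query_vecs, index=POSE_INDEX):
    """Retrieve one gloss per query vector, preserving order.

    The ordered result stands in for the pose clips that a production
    module would stitch together into a continuous signing sequence.
    """
    return [retrieve_pose(vec, index) for vec in query_vecs]
```

Calling `retrieve_sequence` with one query vector per gloss token yields an ordered list of glosses; the pose clip for each would then be concatenated to form the output sequence. Matching on embeddings rather than exact strings is what allows a query with no exact entry in the database to fall back to the closest available sign.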
