Dynamic LIBRAS Gesture Recognition via CNN over Spatiotemporal Matrix Representation

Jasmine Moreira

Abstract

This paper proposes a method for dynamic hand gesture recognition that composes two models: the MediaPipe Hand Landmarker, which extracts 21 skeletal keypoints of the hand, and a convolutional neural network (CNN) trained to classify gestures from a 90 × 21 spatiotemporal matrix representation of those keypoints. The method is applied to recognizing LIBRAS (Brazilian Sign Language) gestures for device control in a home automation system, covering 11 classes of static and dynamic gestures. For real-time inference, a sliding window with temporal frame triplication enables continuous recognition without recurrent networks. Tests achieved 95% accuracy under low-light conditions and 92% under normal lighting. The results indicate that the approach is effective, although systematic experiments with greater user diversity are needed to evaluate generalization more thoroughly.
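To make the sliding window and the 90 × 21 matrix concrete, the following is a minimal Python sketch and not the author's implementation. It rests on assumptions the abstract does not confirm: that "temporal frame triplication" means each captured frame is repeated three times along the time axis, so that 30 real frames fill the 90 rows, and that each matrix cell holds the (x, y, z) coordinates of one keypoint. The class name `SlidingGestureWindow` and its methods are hypothetical, and the landmark extraction and CNN steps are left out.

```python
from collections import deque

NUM_KEYPOINTS = 21   # from the abstract: hand keypoints per frame
MATRIX_ROWS = 90     # from the abstract: temporal dimension of the matrix
REPEAT = 3           # assumption: "triplication" repeats each frame 3 times
WINDOW_FRAMES = MATRIX_ROWS // REPEAT  # assumption: 30 real frames per window


class SlidingGestureWindow:
    """Hypothetical helper: buffers recent frames and builds a 90 x 21 matrix."""

    def __init__(self):
        # The deque drops the oldest frame automatically once it is full,
        # which gives the sliding-window behaviour.
        self._frames = deque(maxlen=WINDOW_FRAMES)

    def push(self, keypoints):
        """Add one frame: a sequence of 21 (x, y, z) tuples (assumed layout)."""
        if len(keypoints) != NUM_KEYPOINTS:
            raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
        self._frames.append(list(keypoints))

    def ready(self):
        """True once enough frames are buffered to fill the matrix."""
        return len(self._frames) == WINDOW_FRAMES

    def matrix(self):
        """Return 90 rows of 21 keypoints, each buffered frame repeated 3 times."""
        if not self.ready():
            raise RuntimeError("window is not full yet")
        rows = []
        for frame in self._frames:
            for _ in range(REPEAT):
                rows.append(list(frame))
        return rows


if __name__ == "__main__":
    window = SlidingGestureWindow()
    for t in range(35):
        # Placeholder keypoints; a real system would take them from the landmarker.
        window.push([(float(t), 0.0, 0.0)] * NUM_KEYPOINTS)
    m = window.matrix()
    print(len(m), len(m[0]))
```

Under these assumptions, a new matrix can be built after every incoming frame once the buffer is full, so a classifier can be queried continuously without carrying recurrent state between calls.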
