Proper Body Landmark Subset Enables More Accurate and 5X Faster Recognition of Isolated Signs in LIBRAS

Daniele L. V. dos Santos, Thiago B. Pereira, Carlos Eduardo G. R. Alves, Richard J. M. G. Tello, Francisco de A. Boldt, Thiago M. Paixão

TL;DR

The paper tackles isolated sign recognition in LIBRAS using a lightweight landmark pipeline based on MediaPipe, addressing the computational bottleneck of OpenPose. It introduces landmark-subset selection strategies and spline-based imputation to maintain accuracy while achieving significant speedups, outperforming a strong prior method in several benchmarks. The ASL-2nd landmark subset, combined with spline imputation and Skeleton-DML encoding, yields robust performance and substantial time efficiency, enabling scalable real-time isolated sign language recognition (ISLR). This approach opens avenues for deploying efficient LIBRAS recognition in practical applications and invites extension to other sign languages and to continuous sign language scenarios.

Abstract

This paper investigates the feasibility of using lightweight body landmark detection for the recognition of isolated signs in Brazilian Sign Language (LIBRAS). Although the skeleton-based approach by Alves et al. (2024) enabled substantial improvements in recognition performance, the use of OpenPose for landmark extraction hindered time performance. In a preliminary investigation, we observed that simply replacing OpenPose with the lightweight MediaPipe, while improving processing speed, significantly reduced accuracy. To overcome this limitation, we explored landmark subset selection strategies aimed at optimizing recognition performance. Experimental results showed that a proper landmark subset achieves comparable or superior performance to state-of-the-art methods while reducing processing time by more than 5X compared to Alves et al. (2024). As an additional contribution, we demonstrated that spline-based imputation effectively mitigates missing landmark issues, leading to substantial accuracy gains. These findings highlight that careful landmark selection, combined with simple imputation techniques, enables efficient and accurate isolated sign recognition, paving the way for scalable Sign Language Recognition systems.
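
To make the imputation idea concrete, the following is a minimal sketch of how missing landmark coordinates could be filled along the time axis with a spline. The abstract does not specify the spline type or the handling of gaps at the start or end of a video, so the natural cubic spline, the hold-nearest-value rule at the boundaries, and the names `impute_track` and `_second_derivatives` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def _second_derivatives(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Second derivatives of the natural cubic spline through (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    m = [0.0] * n  # natural boundary: m[0] = m[n-1] = 0
    size = n - 2   # number of interior knots
    if size <= 0:
        return m
    diag = [2.0 * (h[i] + h[i + 1]) for i in range(size)]
    rhs = [
        6.0 * ((ys[i + 2] - ys[i + 1]) / h[i + 1] - (ys[i + 1] - ys[i]) / h[i])
        for i in range(size)
    ]
    # Thomas algorithm for the tridiagonal system (forward elimination).
    for i in range(1, size):
        w = h[i] / diag[i - 1]
        diag[i] -= w * h[i]
        rhs[i] -= w * rhs[i - 1]
    # Back substitution; interior knot i maps to m[i + 1].
    m[size] = rhs[-1] / diag[-1]
    for i in range(size - 2, -1, -1):
        m[i + 1] = (rhs[i] - h[i + 1] * m[i + 2]) / diag[i]
    return m


def impute_track(track: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries of one landmark coordinate over time.

    Assumptions (not from the paper): natural cubic spline between observed
    frames; gaps before the first or after the last observation are filled
    with the nearest observed value.
    """
    xs = [float(t) for t, v in enumerate(track) if v is not None]
    ys = [float(v) for v in track if v is not None]
    if not xs:
        return list(track)  # nothing observed: leave the track untouched
    if len(xs) == 1:
        return [ys[0]] * len(track)
    m = _second_derivatives(xs, ys)
    out: List[Optional[float]] = []
    seg = 0  # index of the spline segment [xs[seg], xs[seg + 1]]
    for t, v in enumerate(track):
        if v is not None:
            out.append(float(v))
        elif t < xs[0]:
            out.append(ys[0])
        elif t > xs[-1]:
            out.append(ys[-1])
        else:
            while xs[seg + 1] < t:
                seg += 1
            h = xs[seg + 1] - xs[seg]
            a, b = xs[seg + 1] - t, t - xs[seg]
            out.append(
                (m[seg] * a ** 3 + m[seg + 1] * b ** 3) / (6.0 * h)
                + (ys[seg] / h - m[seg] * h / 6.0) * a
                + (ys[seg + 1] / h - m[seg + 1] * h / 6.0) * b
            )
    return out
```

In this sketch each coordinate of each landmark would be passed through `impute_track` separately, with `None` marking frames where the detector returned nothing for that landmark.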

Paper Structure

This paper contains 28 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Pipeline of the proposed landmark-based approach for isolated sign recognition in LIBRAS. Video frames are processed with MediaPipe to extract landmarks, which are selected (subset selection), interpolated (imputation), encoded as 2-D skeleton images, and classified by a CNN to predict the sign label.
  • Figure 2: Landmark subset selection: the columns show the set of landmarks used in each strategy and the respective count across the body parts.
  • Figure 3: Distribution of video sequence length w.r.t. signs for the evaluation datasets.
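
The subset-selection step named in the Figure 1 and Figure 2 captions can be sketched as a simple index filter applied to every frame. The body-part names, the frame layout, and the indices in `EXAMPLE_SUBSET` below are hypothetical placeholders chosen for illustration; they are not the ASL-2nd subset or any other subset evaluated in the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]        # (x, y) of one landmark
Frame = Dict[str, List[Point]]     # body part name -> landmarks of that part

# Hypothetical subset: indices kept per body part (illustrative values only).
EXAMPLE_SUBSET: Dict[str, List[int]] = {
    "pose": [0, 1, 2],
    "left_hand": [0, 4],
    "right_hand": [0, 4],
}


def select_landmarks(frame: Frame, subset: Dict[str, List[int]]) -> List[Point]:
    """Keep only the landmarks listed in `subset`, in a fixed part order."""
    selected: List[Point] = []
    for part, indices in subset.items():
        points = frame[part]
        selected.extend(points[i] for i in indices)
    return selected


def count_per_part(subset: Dict[str, List[int]]) -> Dict[str, int]:
    """Number of landmarks kept for each body part, as tallied in Figure 2."""
    return {part: len(indices) for part, indices in subset.items()}
```

Applying `select_landmarks` to each frame before imputation and encoding keeps the downstream skeleton image small, which is one plausible source of the reported speedup alongside the lighter detector.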