On the Evaluation Protocol of Gesture Recognition for UAV-based Rescue Operation based on Deep Learning: A Subject-Independence Perspective

Domonkos Varga

TL;DR

This paper presents a methodological analysis of the gesture-recognition approach proposed by Liu and Szirányi, with a particular focus on the validity of their evaluation protocol, and demonstrates that the evaluation does not measure generalization to unseen individuals.

Abstract

This paper presents a methodological analysis of the gesture-recognition approach proposed by Liu and Szirányi, with a particular focus on the validity of their evaluation protocol. We show that the reported near-perfect accuracy metrics result from a frame-level random train-test split that inevitably places samples from the same subjects in both sets, causing severe data leakage. By examining the published confusion matrix, learning curves, and dataset construction, we demonstrate that the evaluation does not measure generalization to unseen individuals. Our findings underscore the importance of subject-independent data partitioning in vision-based gesture-recognition research, especially for applications such as UAV-human interaction that require reliable recognition of gestures performed by previously unseen people.
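
The contrast between the two partitioning schemes can be illustrated with a minimal sketch. It is not the code used in either paper: the toy dataset (5 subjects, 2 gestures, 50 frames per recording), the dictionary keys, and all function names are hypothetical and chosen only to make the leakage visible.

```python
import random


def frame_level_split(frames, test_fraction=0.2, seed=0):
    """Shuffle individual frames and cut: the leakage-prone protocol."""
    rng = random.Random(seed)
    shuffled = frames[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_independent_split(frames, test_fraction=0.2, seed=0):
    """Hold out whole subjects so that no person appears in both sets."""
    rng = random.Random(seed)
    subjects = sorted({f["subject"] for f in frames})
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    held_out = set(subjects[:n_test])
    train = [f for f in frames if f["subject"] not in held_out]
    test = [f for f in frames if f["subject"] in held_out]
    return train, test


def shared_subjects(train, test):
    """Subjects that contribute frames to both the train and the test set."""
    return {f["subject"] for f in train} & {f["subject"] for f in test}


# Hypothetical toy data: 5 subjects, 2 gestures, 50 frames per recording.
frames = [
    {"subject": s, "gesture": g, "frame": i}
    for s in range(5)
    for g in range(2)
    for i in range(50)
]

leaky_train, leaky_test = frame_level_split(frames)
clean_train, clean_test = subject_independent_split(frames)

print("frame-level split, shared subjects:", sorted(shared_subjects(leaky_train, leaky_test)))
print("subject-level split, shared subjects:", sorted(shared_subjects(clean_train, clean_test)))
```

Under the frame-level split, consecutive and near-identical frames of the same person end up on both sides of the partition, so test accuracy largely reflects recognition of already-seen individuals. Holding out entire subjects removes that overlap by construction.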

Paper Structure (18 sections, 10 figures, 1 table)

Figures (10)

  • Figure 1: Workflow of the gesture-recognition system implemented and proposed by Liu & Szirányi liu2021gesture.
  • Figure 2: The normalized confusion matrix published in liu2021gesture.
  • Figure 3: The training curves published in liu2021gesture.
  • Figure 4: Prompt submitted to large language models for independent curve analysis.
  • Figure 5: Generated answer of Claude Sonnet 4.5.
  • ...and 5 more figures