Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee

Filippo Ansalone, Flavio Maiorana, Daniele Affinita, Flavio Volpi, Eugenio Bugli, Francesco Petri, Michele Brienza, Valerio Spagnoli, Vincenzo Suriani, Daniele Nardi, Domenico D. Bloisi

TL;DR

Using the NAO robot platform, this study implements a two-stage pipeline for gesture recognition through keypoint extraction and classification, alongside continuous convolutional neural networks (CCNNs) for efficient whistle detection.

Abstract

Advancing human-robot communication is crucial for autonomous systems operating in dynamic environments, where accurate real-time interpretation of human signals is essential. RoboCup provides a compelling scenario for testing these capabilities, requiring robots to understand the referee's gestures and whistle signals with minimal reliance on the network. Using the NAO robot platform, this study implements a two-stage pipeline for gesture recognition through keypoint extraction and classification, alongside continuous convolutional neural networks (CCNNs) for efficient whistle detection. The proposed approach enhances real-time human-robot interaction in a competitive setting like RoboCup, offering tools to advance the development of autonomous systems capable of cooperating with humans.

Paper Structure

This paper contains 11 sections, 1 equation, 2 figures, and 1 table.

Figures (2)

  • Figure 1: Overview of the RoboCup SPL field during the standby phase (left) and referee gesture detection from the robot's perspective (right). The right image highlights the region of interest (ROI) and displays the detected skeleton.
  • Figure 2: States triggered during a game for each robot on the field. The modules that process the referee's signals are integrated into the pipeline: a certain number of robots must recognize a specific referee signal (a gesture recognized over 4 consecutive camera frames, or a whistle) to move immediately to the following state, bypassing the delay associated with the message from the Game Controller.
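The transition rule described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 4-consecutive-frame requirement comes from the caption, while the class and function names, the signal labels, and the `quorum` value are hypothetical (the caption only says "a certain number of robots").

```python
from dataclasses import dataclass
from typing import Dict, Optional

# From the Figure 2 caption: a gesture must be seen on 4 consecutive frames.
REQUIRED_CONSECUTIVE_FRAMES = 4


@dataclass
class GestureDebouncer:
    """Confirms a gesture only after it appears on N consecutive frames.

    Hypothetical helper: one instance would run on each robot.
    """
    required: int = REQUIRED_CONSECUTIVE_FRAMES
    _last: Optional[str] = None
    _streak: int = 0

    def update(self, label: Optional[str]) -> Optional[str]:
        """Feed one per-frame label (None = nothing detected).

        Returns the gesture label once the streak reaches `required`,
        otherwise None.
        """
        if label is None:
            # A missed frame breaks the streak.
            self._last, self._streak = None, 0
            return None
        if label == self._last:
            self._streak += 1
        else:
            # A different gesture restarts the count.
            self._last, self._streak = label, 1
        return label if self._streak >= self.required else None


def team_should_transition(
    robot_signals: Dict[int, Optional[str]], expected: str, quorum: int
) -> bool:
    """True when at least `quorum` robots report the expected signal.

    `quorum` is an assumed parameter; the source does not give its value.
    """
    votes = sum(1 for signal in robot_signals.values() if signal == expected)
    return votes >= quorum
```

In this sketch each robot would feed its per-frame classifier output to `update`, and the team would change state as soon as `team_should_transition` returns true, without waiting for the Game Controller message. A whistle detection could be reported through the same per-robot signal map, since the caption does not require it to persist over several frames.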