RoboFiSense: Attention-Based Robotic Arm Activity Recognition with WiFi Sensing

Rojin Zandi, Kian Behzad, Elaheh Motamedi, Hojjat Salehinejad, Milad Siami

TL;DR

This work addresses indoor robotic arm activity recognition without visual sensors by leveraging WiFi channel state information. It introduces BiVTC, a dual-stream transformer-based model, and RoboFiSense, a public CSI dataset capturing eight Franka Emika arm actions across velocity levels and sniffer positions. BiVTC achieves state-of-the-art accuracy (≈92.5%) and demonstrates robustness to velocity changes, sampling rates, and sniffer placement, while the dataset provides a benchmark for ongoing research in privacy-preserving robotic sensing. The results highlight the practical potential of WiFi-based sensing for safe, non-intrusive robot monitoring in privacy-sensitive indoor environments.

Abstract

Despite the current surge of interest in autonomous robotic systems, robot activity recognition within restricted indoor environments remains a formidable challenge. Conventional methods for detecting and recognizing robotic arms' activities often rely on vision-based or light detection and ranging (LiDAR) sensors, which require line-of-sight (LoS) access and may raise privacy concerns, for example, in nursing facilities. This research pioneers an approach that harnesses channel state information (CSI) measured from WiFi signals, which are subtly influenced by the activity of robotic arms. We developed an attention-based network to classify eight distinct activities performed by a Franka Emika robotic arm in different situations. Our proposed bidirectional vision transformer-concatenated (BiVTC) methodology aims to predict robotic arm activities accurately, even when trained on activities with different velocities, without depending on external or internal sensors or visual aids. Because CSI data depend strongly on the environment, we also study the problem of sniffer location selection by systematically changing the sniffer's location and collecting different sets of data. Finally, this paper marks the first publication of CSI data for eight distinct robotic arm activities, collectively referred to as RoboFiSense. This initiative aims to provide a benchmark dataset and baselines to the research community, fostering advances in the field of robotic sensing.

Paper Structure (21 sections, 7 equations, 11 figures, 3 tables)

Figures (11)

  • Figure 1: A Franka Emika robot with annotated joints and axes [frankaemika].
  • Figure 2: Architecture of the proposed bidirectional vision transformer-concatenated (BiVTC) model. The collected channel state information (CSI) measurements from each sniffer are separately patched, encoded, and fed to the transformer blocks for feature extraction. The feature vectors $\mathbf{f}_1$ and $\mathbf{f}_2$ are concatenated as $\mathbf{f}_c$ and passed as input to a multi-layer perceptron (MLP) network.
  • Figure 3: Floor plan of the data collection environment. The grid area provides an overview of various sniffer placement options.
  • Figure 4: Illustration of the hardware setup and the corresponding schematic map.
  • Figure 5: Illustration of the eight activities performed by the Franka Emika arm in the experiments: (a) Arc, (b) Elbow, (c) Rectangle, (d) Silence, (e) Straight Line - Forward (SLFW), (f) Straight Line - Right Left (SLRL), (g) Straight Line - Up Down (SLUD), and (h) Triangle. The motion patterns of the robotic arm are shown with red dashed lines.
  • ...and 6 more figures
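
The data flow described in the Figure 2 caption (patch each CSI measurement, encode each stream separately, then concatenate the two feature vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the patch size, the toy input shapes, and the helper names `patchify`, `encode_stream`, and `concat_features` are all assumptions made for illustration, and the transformer blocks are replaced by a simple mean-pooling placeholder so that the example stays self-contained.

```python
from typing import List

Matrix = List[List[float]]


def patchify(csi: Matrix, patch_h: int, patch_w: int) -> List[List[float]]:
    """Split a 2D CSI matrix into non-overlapping patches, each flattened row-major."""
    rows, cols = len(csi), len(csi[0])
    if rows % patch_h or cols % patch_w:
        raise ValueError("matrix dimensions must be divisible by the patch size")
    patches = []
    for r in range(0, rows, patch_h):
        for c in range(0, cols, patch_w):
            patches.append(
                [csi[r + i][c + j] for i in range(patch_h) for j in range(patch_w)]
            )
    return patches


def encode_stream(csi: Matrix, patch_h: int, patch_w: int) -> List[float]:
    """Placeholder for one stream: mean-pool the flattened patches.

    ASSUMPTION: a real stream would embed the patches and pass them through
    transformer (self-attention) blocks; mean pooling only stands in for that
    step so the shapes can be followed end to end.
    """
    patches = patchify(csi, patch_h, patch_w)
    dim = len(patches[0])
    return [sum(p[k] for p in patches) / len(patches) for k in range(dim)]


def concat_features(f_1: List[float], f_2: List[float]) -> List[float]:
    """Concatenate the two stream features into f_c, the input to the MLP head."""
    return list(f_1) + list(f_2)


# Toy 4x4 "CSI" matrices (hypothetical values, not real measurements).
csi_1 = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
csi_2 = [[1.0] * 4 for _ in range(4)]

f_1 = encode_stream(csi_1, 2, 2)
f_2 = encode_stream(csi_2, 2, 2)
f_c = concat_features(f_1, f_2)
print(len(f_1), len(f_2), len(f_c))
```

In this sketch each stream yields a feature vector whose length equals the flattened patch size, and `f_c` is twice that length. In a full model, `f_c` would feed an MLP classifier with one output per activity class.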