Multi-LED Classification as Pretext For Robot Heading Estimation

Nicholas Carlotti, Mirko Nava, Alessandro Giusti

TL;DR

The paper tackles vision-based relative robot localization and heading estimation with limited labels by formulating a self-supervised pretext task: predicting the ON/OFF states of four LEDs mounted on each robot from monocular RGB input. A fully convolutional network (FCN) outputs per-pixel maps for robot presence, heading components, and LED states; the position is taken from the peak of the presence map, and the heading from the heading map weighted by the robot's location. Training uses an LED-state loss weighted by the robot's projection and by LED visibility, enabling detection and heading estimation without pose labels. Results show a median position error of $14.5$ px and a median heading error of $17.0$ deg, close to a supervised upper bound of $10.1$ px and $8.4$ deg on the visible subset, demonstrating practical, low-label learning for multi-robot scenarios.
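The decoding step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `decode_pose`, the `(H, W)` array layout, and the choice of cosine and sine as the two "heading components" are assumptions made for illustration only.

```python
import numpy as np


def decode_pose(presence, heading_cos, heading_sin):
    """Decode an image-space position and a heading from per-pixel maps.

    presence:    (H, W) non-negative robot-presence scores (assumed layout)
    heading_cos: (H, W) per-pixel cosine of the heading (assumed encoding)
    heading_sin: (H, W) per-pixel sine of the heading (assumed encoding)
    Returns ((x, y), heading_in_radians).
    """
    # Position: the pixel where the presence map peaks.
    row, col = np.unravel_index(np.argmax(presence), presence.shape)

    # Heading: average the heading components, weighting each pixel by
    # its normalized presence score, then convert back to an angle.
    weights = presence / (presence.sum() + 1e-8)
    c = float((weights * heading_cos).sum())
    s = float((weights * heading_sin).sum())
    heading = float(np.arctan2(s, c))

    return (int(col), int(row)), heading


# Toy usage: a single bright pixel and a constant heading of 0.5 rad.
presence = np.zeros((8, 8))
presence[2, 5] = 1.0
position, heading = decode_pose(
    presence,
    np.full((8, 8), np.cos(0.5)),
    np.full((8, 8), np.sin(0.5)),
)
```

Averaging the two components before calling `arctan2` avoids the wrap-around problem that would arise from averaging angles directly.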

Abstract

We propose a self-supervised approach for visual robot detection and heading estimation by learning to estimate the states (OFF or ON) of four independent robot-mounted LEDs. Experimental results show a median image-space position error of 14 px and relative heading MAE of 17 degrees, versus a supervised upper bound scoring 10 px and 8 degrees, respectively.

Paper Structure (5 sections, 1 figure, 1 table)

This paper contains 5 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: We learn robot detection and heading estimation (a) by optimizing the LED loss $\mathcal{L}$ (b) weighted by position $\bm{\hat{P}}$ and orientation $\bm{\hat{\Psi}}$ belief maps (c).
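The weighting in the caption above could look roughly like the following NumPy sketch. It is a hedged illustration rather than the paper's loss: it assumes a per-pixel binary cross-entropy for each LED, uses the normalized presence map as the spatial weight, and takes the per-LED visibility weights as a plain input argument (`visibility`, a hypothetical name), because how visibility is derived from the orientation belief is not specified here.

```python
import numpy as np


def led_loss(led_probs, led_states, presence, visibility):
    """Weighted LED-state loss (illustrative sketch, not the paper's code).

    led_probs:  (4, H, W) predicted per-pixel probability that each LED is ON
    led_states: (4,) known LED states, 0 (OFF) or 1 (ON)
    presence:   (H, W) non-negative position belief map
    visibility: (4,) per-LED weights in [0, 1] (hypothetical input; assumed
                to come from the orientation belief)
    """
    eps = 1e-7
    p = np.clip(led_probs, eps, 1.0 - eps)
    y = np.asarray(led_states, dtype=float).reshape(-1, 1, 1)

    # Per-pixel binary cross-entropy for each LED, shape (4, H, W).
    bce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

    # Spatial weighting: pixels believed to contain the robot count more.
    w_pos = presence / (presence.sum() + 1e-8)
    per_led = (bce * w_pos).sum(axis=(1, 2))  # shape (4,)

    # LED weighting: LEDs believed to be hidden contribute less.
    v = np.asarray(visibility, dtype=float)
    return float((v * per_led).sum() / (v.sum() + 1e-8))


# Toy usage: correct LED predictions score lower than inverted ones.
states = np.array([1, 0, 1, 0])
good = np.broadcast_to(
    (states * 0.98 + 0.01).reshape(4, 1, 1), (4, 4, 4)
).copy()
uniform = np.ones((4, 4))
loss_good = led_loss(good, states, uniform, np.ones(4))
loss_bad = led_loss(1.0 - good, states, uniform, np.ones(4))
```

Because the LED states are known without manual annotation, minimizing a loss of this shape can only succeed if the belief maps place their weight where the LEDs actually appear, which is what allows detection and heading to be learned without pose labels.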