
NeRFoot: Robot-Footprint Estimation for Image-Based Visual Servoing

Daoxin Zhong, Luke Robinson, Daniele De Martini

Abstract

This paper investigates the utility of Neural Radiance Fields (NeRF) models in extending the regions of operation of a mobile robot controlled by Image-Based Visual Servoing (IBVS) via static CCTV cameras. Using NeRF as a 3D-representation prior, the robot's footprint can be extrapolated geometrically and used to train a CNN-based network to extract it online from the robot's appearance alone. The resulting footprint gives a tighter bound than a robot-wide bounding box, allowing the robot's controller to prescribe closer-to-optimal trajectories and expand its safe operational floor area.

Paper Structure (8 sections, 2 figures, 2 tables)


Figures (2)

  • Figure 1: The controller of [robinson2023robot] steers the robot based on its bounding box (yellow) and orientation (green). When checking whether a trajectory is safe, the box must stay within the drivable region (blue). However, this excludes a large area that is still safe, because there the box intersects the wall (red) in the image plane.
  • Figure 2: System diagram of the footprint-estimation training process via the projection ray transform method (top) and examples of the footprint (bottom) obtained from the CAD ground truth (a), from YOLO trained on the real-world ground truth (b), and from YOLO trained on the synthetic ground truth (c).
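
The safety check described in Figure 1 can be illustrated with a minimal sketch. It is not the paper's implementation: the mask representation, the function names `box_is_safe` and `footprint_is_safe`, and the toy 6x6 image are all assumptions made for illustration. It only shows why a footprint mask can pass a drivable-region test that a robot-wide bounding box fails.

```python
from typing import List, Tuple

# Row-major image-plane mask (hypothetical representation): True = pixel in region.
Mask = List[List[bool]]


def box_is_safe(drivable: Mask, box: Tuple[int, int, int, int]) -> bool:
    """Return True if every pixel of the axis-aligned box is drivable.

    box is (x_min, y_min, x_max, y_max), inclusive, in pixel coordinates.
    """
    x_min, y_min, x_max, y_max = box
    return all(
        drivable[y][x]
        for y in range(y_min, y_max + 1)
        for x in range(x_min, x_max + 1)
    )


def footprint_is_safe(drivable: Mask, footprint: Mask) -> bool:
    """Return True if every footprint pixel is also a drivable pixel."""
    return all(
        d or not f
        for d_row, f_row in zip(drivable, footprint)
        for d, f in zip(d_row, f_row)
    )


# Toy 6x6 image: the top two rows show the wall, the rest is drivable floor.
drivable = [[y >= 2 for _ in range(6)] for y in range(6)]

# The robot's body spans image rows 1..4, so its bounding box overlaps the wall.
box = (1, 1, 3, 4)

# Its footprint (where it touches the floor) only covers rows 3..4.
footprint = [[(3 <= y <= 4) and (1 <= x <= 3) for x in range(6)] for y in range(6)]

print(box_is_safe(drivable, box))              # False
print(footprint_is_safe(drivable, footprint))  # True
```

In this toy case the same robot pose is rejected by the bounding-box test but accepted by the footprint test, which is the extra operational area the abstract refers to.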