Towards Real-Time Generation of Delay-Compensated Video Feeds for Outdoor Mobile Robot Teleoperation

Neeloy Chakraborty, Yixiao Fang, Andre Schreiber, Tianchen Ji, Zhe Huang, Aganze Mihigo, Cassidy Wall, Abdulrahman Almana, Katherine Driggs-Campbell

TL;DR

The paper addresses latency in teleoperation video feeds for outdoor agricultural robots by introducing a modular, real-time framework that generates delay-compensated views using monocular depth estimation, a depth-based 3D scene representation rendered with a sphere-based Pulsar renderer, future pose prediction, and an inpainting network to fill disoccluded regions. Depth foundation models are finetuned on TerraSentia field data to support accurate 3D reprojections, while a simple motion model forecasts future poses for rendering. Offline evaluations on under-canopy crop data show superior image fidelity compared to baselines, and a real-time ROS deployment demonstrates practical viability in outdoor conditions, albeit with sensitivity to odometry and kinematics accuracy. Collectively, the work advances robust, real-time delay compensation for outdoor robot teleoperation by integrating depth-based rendering, neural refinement, and ROS-enabled deployment for agricultural applications.
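To make the "simple motion model" step concrete, the sketch below forecasts a planar pose over the delay interval under a constant-velocity unicycle model. This is an illustrative assumption, not the paper's stated model: the function name `predict_pose`, its parameters, and the choice of a unicycle model are all hypothetical.

```python
import math


def predict_pose(x, y, theta, v, omega, delay):
    """Forecast a planar pose (x, y, heading) `delay` seconds ahead.

    Hypothetical sketch: assumes linear velocity `v` (m/s) and angular
    velocity `omega` (rad/s) stay constant over the delay interval.
    """
    if abs(omega) < 1e-9:
        # Negligible turning: move along the current heading in a straight line.
        return (x + v * delay * math.cos(theta),
                y + v * delay * math.sin(theta),
                theta)
    # Constant turn rate: the robot follows a circular arc of radius v / omega.
    new_theta = theta + omega * delay
    radius = v / omega
    new_x = x + radius * (math.sin(new_theta) - math.sin(theta))
    new_y = y - radius * (math.cos(new_theta) - math.cos(theta))
    return new_x, new_y, new_theta
```

A renderer would then draw the 3D scene from the forecast pose rather than the last received one. Because the forecast depends directly on the velocity estimates, errors in odometry carry over into the rendered view, which is consistent with the sensitivity noted above.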

Abstract

Teleoperation is an important technology to enable supervisors to control agricultural robots remotely. However, environmental factors in dense crop rows and limitations in network infrastructure hinder the reliability of data streamed to teleoperators. These issues result in delayed and variable frame rate video feeds that often deviate significantly from the robot's actual viewpoint. We propose a modular learning-based vision pipeline to generate delay-compensated images in real-time for supervisors. Our extensive offline evaluations demonstrate that our method generates more accurate images compared to state-of-the-art approaches in our setting. Additionally, ours is one of the few works to evaluate a delay-compensation method in outdoor field environments with complex terrain on data from a real robot in real-time. Resulting videos and code are provided at https://sites.google.com/illinois.edu/comp-teleop.

Paper Structure

This paper contains 17 sections, 6 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: (a) The TerraSentia+ robot in dense growth and (b) an example of a remote teleoperation setup.
  • Figure 2: Block diagram of the ROS pipeline. The robot sends sensor messages (green) to our node. Functions must wait for mutex locks when accessing or modifying global data (blue). The renderer generates images at 30 Hz to enable a consistent FPS display.
  • Figure 3: A comparison of depth model estimates and resulting Pulsar renderings with ground truth (GT). Holes in predictions are drawn in green.
  • Figure 4: Examples of ResNet inpainting model predictions.