Offline Deep Model Predictive Control (MPC) for Visual Navigation

Taha Bouzid, Youssef Alj

TL;DR

This work addresses visual navigation using a single RGB camera by learning an offline deep MPC policy that follows a sequence of subgoal images. It introduces ViewNet for future-image prediction conditioned on current view and velocity, and VelocityNet to generate multi-step velocity commands under an MPC objective that minimizes image discrepancy while enforcing smooth velocities, all trained entirely offline in simulation. The approach is evaluated in a ROS Gazebo house environment, with a horizon $N=2$, image-difference threshold $e_m=17$, and velocity bounds $v_{max}=0.5$ m/s and $\omega_{max}=1.0$ rad/s, demonstrating stable tracking across linear, rotational, and combined motions. The results suggest that offline deep MPC can provide accurate and safe visual navigation suitable for embedded platforms, with future work pointing toward obstacle avoidance and real-world transfer using advanced view synthesis techniques such as NeRF.
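The numeric settings quoted above can be made concrete with a small sketch. The bounds $v_{max}$, $\omega_{max}$ and the threshold $e_m$ come from the summary; everything else is an assumption for illustration: the helper names, the symmetric clipping range, the mean-absolute-difference metric, and the rule that a subgoal counts as reached when the difference falls below $e_m$.

```python
import numpy as np

V_MAX = 0.5      # m/s, linear velocity bound quoted in the summary
OMEGA_MAX = 1.0  # rad/s, angular velocity bound quoted in the summary
E_M = 17.0       # image-difference threshold quoted in the summary


def clamp_command(v, omega):
    """Clip a (v, omega) command to the bounds (symmetric range is an assumption)."""
    return (float(np.clip(v, -V_MAX, V_MAX)),
            float(np.clip(omega, -OMEGA_MAX, OMEGA_MAX)))


def image_difference(current, subgoal):
    """Mean absolute pixel difference on a 0-255 scale (assumed metric)."""
    cur = np.asarray(current, dtype=np.float64)
    tgt = np.asarray(subgoal, dtype=np.float64)
    return float(np.mean(np.abs(cur - tgt)))


def next_subgoal_index(current, subgoals, index):
    """Move to the next subgoal once the current one is matched closely enough.

    Hypothetical switching rule: advance when the difference drops below E_M,
    and stay on the last subgoal once it is reached.
    """
    reached = image_difference(current, subgoals[index]) < E_M
    if reached and index < len(subgoals) - 1:
        return index + 1
    return index
```

In a repeat run, the index returned by `next_subgoal_index` would select the subgoal image fed to the controller at the next step, and `clamp_command` would be applied to each generated command before it is sent to the robot.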

Abstract

In this paper, we propose a new visual navigation method based on a single RGB perspective camera. Following the Visual Teach & Repeat (VT&R) methodology, the robot acquires a visual trajectory consisting of multiple subgoal images in the teaching step. For the repeat step, we propose two network architectures, ViewNet and VelocityNet, whose combination allows the robot to follow the visual trajectory. ViewNet is trained to generate a future image from the current view and the velocity command. The generated future image is combined with the subgoal image to train VelocityNet. We develop an offline Model Predictive Control (MPC) policy within VelocityNet with the dual goals of (1) reducing the difference between current and subgoal images and (2) ensuring smooth trajectories by mitigating velocity discontinuities. Offline training conserves computational resources, making the method better suited to platforms with limited computational capabilities, such as embedded systems. We validate our approach in a simulation environment, demonstrating that our model can effectively minimize the metric error between real and played trajectories.
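The two goals stated in the abstract can be pictured as a single cost with an image term and a smoothness term. The following is a minimal sketch under stated assumptions, not the paper's loss: the function name `mpc_cost`, the mean absolute difference, the squared velocity change, and the weight `smooth_weight` are all hypothetical, and the predicted images are passed in as plain arrays rather than produced by ViewNet.

```python
import numpy as np


def mpc_cost(predicted_images, subgoal, velocities, prev_velocity,
             smooth_weight=0.1):
    """Illustrative MPC-style cost over a horizon of N steps.

    predicted_images: array (N, H, W, C), one predicted view per step
    subgoal:          array (H, W, C), the target subgoal image
    velocities:       array (N, 2), rows of (v, omega) commands
    prev_velocity:    array (2,), the command applied just before the horizon
    smooth_weight:    hypothetical trade-off weight between the two terms
    """
    predicted = np.asarray(predicted_images, dtype=np.float64)
    target = np.asarray(subgoal, dtype=np.float64)

    # Goal (1): predicted views should match the subgoal image.
    # The subgoal broadcasts across the horizon dimension.
    image_term = np.mean(np.abs(predicted - target))

    # Goal (2): consecutive commands should not jump.
    # Prepend the previous command so the first step is penalised too.
    all_vel = np.vstack([np.asarray(prev_velocity, dtype=np.float64),
                         np.asarray(velocities, dtype=np.float64)])
    step_changes = np.diff(all_vel, axis=0)
    smooth_term = np.mean(np.sum(step_changes ** 2, axis=1))

    return float(image_term + smooth_weight * smooth_term)
```

With identical predicted images, a command sequence that changes gradually yields a lower cost than one that jumps between extremes, which is the behaviour the smoothness goal is meant to encourage.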

Paper Structure (27 sections, 8 equations, 10 figures, 2 tables, 1 algorithm)

Figures (10)

  • Figure 1: Upper view of the house model
  • Figure 2: Images captured from the robot's perspective camera, showing the effect of texturing the floor: (a) before texturing and (b) after texturing.
  • Figure 3: The mobile robot model used in the simulation
  • Figure 4: ViewNet architecture for future image prediction.
  • Figure 5: The proposed path-following approach. $I_t$ is the image from the camera attached to the front of the robot at time $t$, $I_i$ is the subgoal image of index $i$ from the visual trajectory, and $(v_i,\omega_i)$ are the velocity commands generated to reach the $i$-th subgoal image.
  • ...and 5 more figures