LeTO: Learning Constrained Visuomotor Policy with Differentiable Trajectory Optimization

Zhengtong Xu, Yu She

TL;DR

LeTO integrates trajectory optimization with neural networks to generate actions that both accomplish manipulation tasks and comply with constraints. This improves the interpretability, safety, and reliability of robot policies acquired through imitation learning and facilitates their deployment in scenarios with high safety requirements.

Abstract

This paper introduces LeTO, a method for learning constrained visuomotor policies with differentiable trajectory optimization. Our approach integrates a differentiable optimization layer into the neural network. By formulating the optimization layer as a trajectory optimization problem, we enable the model to generate actions end to end in a safe and constraint-controlled fashion without extra modules. Our method allows constraint information to be introduced during training, thereby balancing the training objectives of satisfying constraints, smoothing the trajectories, and minimizing errors with respect to demonstrations. This "gray box" method marries optimization-based safety and interpretability with the powerful representational abilities of neural networks. We quantitatively evaluate LeTO in simulation and on a real robot. The results demonstrate that LeTO performs well in both simulated and real-world tasks. In addition, it generates trajectories that are less uncertain, of higher quality, and smoother than those of existing imitation learning methods. LeTO thus provides a practical example of how to integrate neural networks with trajectory optimization. We release our code at https://github.com/ZhengtongXu/LeTO.
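
To make the idea of an optimization step that trades off tracking, smoothness, and constraints more concrete, here is a minimal sketch. It is not LeTO's formulation or code (the authors' implementation is in the repository linked above). It assumes a one-dimensional action sequence, a simple box constraint, and a quadratic cost, and it solves the problem by projected gradient descent. The function name `solve_trajectory` and all parameters are hypothetical. It shows only the forward solve and does not differentiate through the solver.

```python
def solve_trajectory(reference, lower, upper, smooth_weight=1.0, steps=500):
    """Illustrative only: projected gradient descent on
        min_a  sum_i (a_i - r_i)^2 + w * sum_i (a_{i+1} - a_i)^2
        s.t.   lower <= a_i <= upper
    where r is a reference action sequence (for example, a network output).
    """
    n = len(reference)
    # Step size 1/L; L = 2 + 8*w upper-bounds the largest Hessian eigenvalue.
    step = 1.0 / (2.0 + 8.0 * smooth_weight)
    # Start from the reference clipped into the feasible box.
    a = [min(max(r, lower), upper) for r in reference]
    for _ in range(steps):
        # Gradient of the tracking term.
        grad = [2.0 * (a[i] - reference[i]) for i in range(n)]
        # Gradient of the smoothness term over consecutive differences.
        for i in range(n - 1):
            d = 2.0 * smooth_weight * (a[i + 1] - a[i])
            grad[i] -= d
            grad[i + 1] += d
        # Gradient step followed by projection onto the box constraint.
        a = [min(max(a[i] - step * grad[i], lower), upper) for i in range(n)]
    return a


if __name__ == "__main__":
    # Hypothetical reference with one value outside the allowed range [0, 1].
    print(solve_trajectory([0.0, 0.2, 1.5, 0.4, 0.6], 0.0, 1.0))
```

The returned sequence stays inside the bounds and is smoother than the clipped reference. In a learned policy, a layer of this kind would additionally need gradients with respect to its inputs, which this sketch leaves out.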

Paper Structure (23 sections, 12 equations, 7 figures, 11 tables)

Figures (7)

  • Figure 1: In LeTO, we enable the model to generate actions end to end in a safe and constraint-controlled fashion without extra modules. To the best of our knowledge, LeTO is the first visuomotor imitation learning framework that not only utilizes differentiable optimization but also demonstrates its efficacy in real-world robotic manipulation tasks.
  • Figure 2: Overview of LeTO. We enable the model to generate actions end to end in a safe and constraint-controlled fashion by integrating a differentiable trajectory optimization layer.
  • Figure 3: Illustration of training data sampling.
  • Figure 4: Simulation benchmarks: picking and placing the can (can task) and grasping and assembling the square nut (square task).
  • Figure 5: Move-the-stack task. The robot grasps a set of stacked objects, smoothly transports them, and ultimately places them onto a black board, ensuring that none of the stacked objects fall off. The black tape marks a consistent grasping position.
  • ...and 2 more figures

Theorems & Definitions (2)

  • Remark 1: feasibility of the differentiable optimization layer
  • Remark 2: representational power