SOMTP: Self-Supervised Learning-Based Optimizer for MPC-Based Safe Trajectory Planning Problems in Robotics

Yifan Liu, You Wang, Guang Li

TL;DR

SOMTP addresses the challenge of solving non-convex COPs in MPC-based trajectory planning with Control Barrier Functions by combining problem transcription, a differentiable SLPG correction, and an Augmented Lagrangian–based training regime that integrates guide policy constraints. The approach yields high feasibility and near-optimal solutions at a fraction of the time required by traditional optimizers, outperforming several learning-based baselines in both static test scenarios and continuous robot navigation. The key innovations include moving beyond convex projections through differentiable corrections, jointly updating Lagrange multipliers during training, and guiding the optimizer toward safe regions with a guide policy derived from the SLPG correction. This has practical implications for real-time, safe robotic motion planning, with potential extensions to other COPs and safe RL frameworks.

Abstract

Model Predictive Control (MPC)-based trajectory planning is widely used in robotics, and incorporating Control Barrier Function (CBF) constraints into MPC can greatly improve its obstacle avoidance efficiency. Unfortunately, traditional optimizers are resource-intensive and slow when solving such non-convex constrained optimization problems (COPs), while learning-based methods struggle to satisfy the non-convex constraints. In this paper, we propose the SOMTP algorithm, a self-supervised learning-based optimizer for CBF-MPC trajectory planning. First, SOMTP employs problem transcription to satisfy most of the constraints. Next, a differentiable SLPG correction moves the solution closer to the safe set; this correction is then converted into the guide policy used in the subsequent training process. Finally, inspired by the Augmented Lagrangian Method (ALM), a training algorithm integrated with guide policy constraints enables the optimizer network to converge to a feasible solution. Experiments show that the proposed algorithm achieves better feasibility than other learning-based methods and provides solutions much faster than traditional optimizers with similar optimality.
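
For readers unfamiliar with the Augmented Lagrangian Method that the training algorithm draws on, the following is a minimal sketch of the classical ALM on a single toy problem. It is not the SOMTP training procedure: there is no network, no SLPG correction, and no guide policy. The problem itself (a target inside a circular obstacle), the names `TARGET`, `OBSTACLE_RADIUS`, and `solve_alm`, and all hyperparameters are assumptions made for illustration. It only shows the alternating pattern of minimizing a penalized objective and then updating the Lagrange multiplier for a non-convex inequality constraint.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): reach a target that lies
# inside a circular obstacle, so the best feasible point is on its boundary.
TARGET = np.array([1.0, 0.0])
OBSTACLE_RADIUS = 2.0  # the obstacle is a disc centered at the origin


def objective_grad(x):
    # Gradient of f(x) = ||x - TARGET||^2.
    return 2.0 * (x - TARGET)


def constraint(x):
    # g(x) <= 0 means "outside the obstacle"; this feasible set is non-convex.
    return OBSTACLE_RADIUS**2 - x @ x


def constraint_grad(x):
    # Gradient of g(x) = r^2 - ||x||^2.
    return -2.0 * x


def solve_alm(x0, rho=5.0, lr=0.01, outer_iters=20, inner_iters=500):
    """Classical ALM for one inequality constraint, with gradient-descent inner solves."""
    x = np.asarray(x0, dtype=float).copy()
    lam = 0.0  # Lagrange multiplier estimate, kept non-negative
    for _ in range(outer_iters):
        # Inner loop: approximately minimize the augmented Lagrangian in x.
        for _ in range(inner_iters):
            mu = max(0.0, lam + rho * constraint(x))  # effective multiplier
            x -= lr * (objective_grad(x) + mu * constraint_grad(x))
        # Outer step: multiplier update, clipped at zero for an inequality.
        lam = max(0.0, lam + rho * constraint(x))
    return x, lam


x_opt, lam_opt = solve_alm([3.0, 0.5])
print("solution:", x_opt)
print("multiplier:", lam_opt)
print("constraint violation:", max(0.0, constraint(x_opt)))
```

In this sketch the multiplier is updated between inner solves of a single problem instance. The abstract describes a training algorithm that is only inspired by ALM, so how multipliers and network weights are actually updated in SOMTP should be taken from the paper itself.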

Paper Structure

This paper contains 13 sections, 26 equations, 7 figures, 2 algorithms.

Figures (7)

  • Figure 1: Frames of the robotic system.
  • Figure 2: The structure of the SOMTP algorithm.
  • Figure 3: The structure of the network. Each CO-Layer in the network has the structure shown in the upper right corner of the figure.
  • Figure 4: Results on the test dataset (50,000 instances).
  • Figure 5: Continuous trajectory planning tasks on the robot. Each grid cell is 1 m wide. The target area is denoted by a red circle. Green arrows represent the initial states, while red arrows represent the target states.
  • ...and 2 more figures