Sampling-Based Constrained Motion Planning with Products of Experts

Amirreza Razmjoo, Teng Xue, Suhan Shetty, Sylvain Calinon

TL;DR

The paper tackles motion planning under complex constraints by reframing sampling-based Model Predictive Control (SMPC) as a product of experts over two distributions: an optimality distribution $p^o$ and a feasibility distribution $p^f$. It introduces TT-PoE-MPPI, a tensor-train (TT) based method that projects the optimality distribution into the feasible set before sampling, which enables efficient, gradient-free optimization in challenging environments. Across obstacle avoidance, non-prehensile manipulation, and restricted-volume tracking, TT-PoE-MPPI and its TT-based feasibility variants consistently outperform MPPI, Proj-MPPI, and NF-based PoE approaches, especially at low-to-medium sampling budgets. The work also discusses the advantages of TT models, their scalability limitations, and potential integration with diffusion-based samplers, and presents the approach as a practical, modular framework for robotic planning with constrained actions.

Abstract

We present a novel approach to enhance the performance of sampling-based Model Predictive Control (MPC) in constrained optimization by leveraging products of experts. Our methodology divides the main problem into two components: one focused on optimality and the other on feasibility. By combining the solutions from each component, represented as distributions, we apply products of experts to implement a project-then-sample strategy. In this strategy, the optimality distribution is projected into the feasible area, allowing for more efficient sampling. This approach contrasts with the traditional sample-then-project and naive sample-then-reject methods, leading to more diverse exploration and reducing the accumulation of samples on the boundaries. We demonstrate an effective implementation of this principle using a tensor train-based distribution model, which is characterized by its non-parametric nature, ease of combination with other distributions at the task level, and straightforward sampling technique. We adapt existing tensor train models to suit this purpose and validate the efficacy of our approach through experiments in various tasks, including obstacle avoidance, non-prehensile manipulation, and tasks involving staying in a restricted volume. Our experimental results demonstrate that the proposed method consistently outperforms known baselines, providing strong empirical support for its effectiveness. Sample code for this project is available at https://github.com/idiap/smpc_poe.
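
The contrast between sample-then-project and project-then-sample can be illustrated with a one-dimensional toy problem. The sketch below is not taken from the paper or its repository: the Gaussian optimality distribution, the interval-shaped feasible set, the grid discretization, and all function names are assumptions chosen for illustration. It only shows why clipping samples piles them up on the boundary, while sampling from the normalized product $p^o \cdot p^f$ does not.

```python
import numpy as np


def sample_then_project(rng, mean, std, lo, hi, n):
    """Draw from the optimality Gaussian, then clip each sample into [lo, hi]."""
    x = rng.normal(mean, std, size=n)
    return np.clip(x, lo, hi)


def project_then_sample(rng, mean, std, lo, hi, n, grid_size=201):
    """Multiply optimality and feasibility on a grid, normalize, then sample."""
    grid = np.linspace(mean - 4.0 * std, mean + 4.0 * std, grid_size)
    p_opt = np.exp(-0.5 * ((grid - mean) / std) ** 2)      # unnormalized p^o
    p_feas = ((grid >= lo) & (grid <= hi)).astype(float)   # indicator p^f
    product = p_opt * p_feas
    if product.sum() == 0.0:
        raise ValueError("feasible set does not overlap the grid")
    product /= product.sum()
    return rng.choice(grid, size=n, p=product)


rng = np.random.default_rng(0)
lo, hi = -1.0, 0.5          # hypothetical feasible interval
mean, std = 1.0, 1.0        # optimality mean lies outside the feasible set

clipped = sample_then_project(rng, mean, std, lo, hi, 1000)
projected = project_then_sample(rng, mean, std, lo, hi, 1000)

# Fraction of samples sitting exactly on a boundary of the feasible interval.
frac_boundary_clipped = np.mean((clipped == lo) | (clipped == hi))
frac_boundary_projected = np.mean((projected == lo) | (projected == hi))
print(frac_boundary_clipped, frac_boundary_projected)
```

In this toy setting most clipped samples land on the upper boundary, whereas the samples drawn from the product are spread over the interior of the feasible interval.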

Paper Structure

This paper contains 36 sections, 53 equations, 11 figures, 8 tables, 2 algorithms.

Figures (11)

  • Figure 1: In sampling-based Model Predictive Control (SMPC), many samples can be discarded due to inefficiencies or infeasibility. In (a), only the green samples effectively influence the movement of the object, while the actions indicated by the red arrows have no impact on the environment. (b) illustrates the importance of selecting samples that do not collide with obstacles.
  • Figure 2: An illustrative example highlighting the difference between two sampling strategies from a uniform distribution that meets the task constraints outlined in Fig. \ref{fig:whole_pipeline}: (a) sample-then-project and (b) project-then-sample. In (a), many samples cluster on the boundaries, leaving significant areas within the feasible set unexplored, whereas (b) promotes a more thorough exploration. Note that the difference between the two approaches would become more pronounced if the feasible space were narrower.
  • Figure 3: An illustrative example of using products of experts. The agent must navigate from the red flag to the green flag while remaining in the white area and avoiding the blue one. The problem is divided into two parts: one models the feasibility distribution, and the other models the optimality distribution. These are combined to project the optimality distribution into the feasible area (Project first), from which trajectories are then sampled (Then sample).
  • Figure 4: TT decomposition extends matrix decomposition techniques to higher-dimensional arrays. In the TT format, accessing an element of a tensor involves multiplying the chosen slices (represented by green-colored matrices) of the core tensors (factors). The illustration provides examples for a 2nd-order, a 3rd-order, and a 4th-order tensor. This picture is adapted from Shetty23 with permission from the authors.
  • Figure 5: The pipeline employed in this paper integrates two distributions. Red and blue indicate the cores related to the state and action variables, respectively. Initially, the optimality distribution is transformed into TT-cores, followed by elementwise multiplication with the feasibility distribution cores corresponding to the action space. Subsequently, a new distribution is generated using the updated cores. In the illustration, white boxes represent elements equal to 1, while darker boxes denote elements equal to 0. Note that this visualization is simplified for clarity: in actual scenarios, the cores would not strictly be 0 or 1.
  • ...and 6 more figures
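
The element access described for Figure 4 and the combination of two distributions described for Figure 5 can be sketched with plain NumPy. This is a generic illustration of the TT format, not the authors' implementation: the core shapes, the random cores, and the function names `tt_element` and `tt_hadamard` are assumptions. The elementwise product of two TT tensors is built here with the standard slice-wise Kronecker product of cores; when one factor has all TT ranks equal to 1, this reduces to scaling each slice of the other factor's cores.

```python
import numpy as np


def tt_element(cores, index):
    """Return one tensor entry by multiplying the selected slice of each core.

    Each core has shape (r_prev, n, r_next), with boundary ranks equal to 1.
    """
    out = np.ones((1, 1))
    for core, i in zip(cores, index):
        out = out @ core[:, i, :]
    return out[0, 0]


def tt_hadamard(cores_a, cores_b):
    """Cores of the elementwise product of two TT tensors with equal mode sizes.

    For each mode index, the new slice is the Kronecker product of the two
    original slices, so the TT ranks multiply.
    """
    product = []
    for a, b in zip(cores_a, cores_b):
        ra0, n, ra1 = a.shape
        rb0, _, rb1 = b.shape
        c = np.einsum("inj,knl->iknjl", a, b)
        product.append(c.reshape(ra0 * rb0, n, ra1 * rb1))
    return product


rng = np.random.default_rng(0)
# Hypothetical cores for a 3rd-order tensor with mode sizes (3, 4, 5).
cores_opt = [rng.random(s) for s in [(1, 3, 2), (2, 4, 2), (2, 5, 1)]]
cores_feas = [rng.random(s) for s in [(1, 3, 3), (3, 4, 2), (2, 5, 1)]]
cores_prod = tt_hadamard(cores_opt, cores_feas)

idx = (2, 1, 4)
lhs = tt_element(cores_prod, idx)
rhs = tt_element(cores_opt, idx) * tt_element(cores_feas, idx)
print(np.isclose(lhs, rhs))
```

The check at the end compares one entry of the combined tensor against the product of the corresponding entries of the two factors, without ever forming the full tensors.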