An Unsupervised C-Uniform Trajectory Sampler with Applications to Model Predictive Path Integral Control
O. Goktug Poyrazoglu, Rahul Moorthy, Yukang Cao, William Chastek, Volkan Isler
TL;DR
The paper tackles limited exploration in sampling-based MPC by introducing Neural C-Uniform, an unsupervised learner that maps states to control-input probabilities so that the sampled trajectories cover the configuration space uniformly, without discretizing it. It then couples this sampler with MPPI in CU-MPPI, which uses Neural C-Uniform trajectories to select a strong nominal trajectory and to guide stochastic sampling; results show improved performance on high-curvature paths and over longer horizons. Key contributions include the entropy-maximization formulation for Neural C-Uniform, its neural architecture, and the CU-MPPI framework, validated in simulation and in real-world cluttered and dynamic environments. The approach offers a scalable, uniformly exploratory alternative to gradient-reliant refinements, with practical impact for robust navigation in complex, changing settings.
Abstract
Sampling-based model predictive controllers generate trajectories by sampling control inputs from a fixed, simple distribution such as a normal or uniform distribution. This sampling method yields trajectory samples that are tightly clustered around a mean trajectory. The clustering, in turn, limits the exploration capability of the controller and reduces the likelihood of finding feasible solutions in complex environments. Recent work has attempted to address this problem either by reshaping the resulting trajectory distribution or by increasing the sample entropy to enhance diversity and promote exploration. In our recent work, we introduced the concept of C-Uniform trajectory generation [1], which computes control input probabilities such that the generated trajectories sample the configuration space uniformly. In this work, we first address the main limitation of that method: a lack of scalability due to its computational complexity. We introduce Neural C-Uniform, an unsupervised C-Uniform trajectory sampler that mitigates the scalability issue by computing control input probabilities without relying on a discretized configuration space. Experiments show that Neural C-Uniform achieves a uniformity ratio similar to that of the original C-Uniform approach and generates trajectories over a longer time horizon while preserving uniformity. Next, we present CU-MPPI, which integrates Neural C-Uniform sampling into existing MPPI variants. We analyze the performance of CU-MPPI in simulation and in real-world experiments. Our results indicate that in settings where the optimal solution has high curvature, CU-MPPI leads to drastic improvements in performance.
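The clustering effect described in the abstract can be illustrated with a small numerical sketch. This is not the paper's method or its uniformity metric. It assumes a unicycle whose heading is the integral of a bounded angular velocity, draws independent per-step controls from a clipped normal and from a uniform distribution, and measures what fraction of the reachable heading range the final headings touch. The function names, the 20-bin histogram, and all parameter values are hypothetical choices made for illustration, and the heading is only one coordinate of the configuration.

```python
import numpy as np


def rollout_headings(controls, dt=0.1):
    """Final heading (rad) of a unicycle that starts at heading 0.

    controls: array of shape (num_samples, horizon) holding angular
    velocities in rad/s; the heading is their time integral.
    """
    return controls.sum(axis=1) * dt


def heading_coverage(final_headings, max_heading, num_bins=20):
    """Fraction of equal-width bins over [-max_heading, max_heading]
    that contain at least one sampled final heading."""
    counts, _ = np.histogram(
        final_headings, bins=num_bins, range=(-max_heading, max_heading)
    )
    return np.count_nonzero(counts) / num_bins


# Illustrative parameters (assumptions, not values from the paper).
rng = np.random.default_rng(0)
num_samples, horizon, dt, u_max = 500, 30, 0.1, 1.0
# Largest heading reachable by steering hard in one direction for the whole horizon.
max_heading = u_max * dt * horizon

# Independent per-step controls from the two "simple" distributions.
gaussian_controls = np.clip(
    rng.normal(0.0, 0.2, (num_samples, horizon)), -u_max, u_max
)
uniform_controls = rng.uniform(-u_max, u_max, (num_samples, horizon))

gaussian_coverage = heading_coverage(
    rollout_headings(gaussian_controls, dt), max_heading
)
uniform_coverage = heading_coverage(
    rollout_headings(uniform_controls, dt), max_heading
)
print(f"Gaussian: {gaussian_coverage:.2f}, uniform: {uniform_coverage:.2f}")
```

In this toy setting, independent per-step noise largely averages out over the horizon, so the final headings concentrate near the mean even when each individual control is drawn uniformly from the full input range. A sampler that targets uniformity in configuration space rather than in control space is meant to avoid exactly this concentration.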
