Sample-Efficient and Smooth Cross-Entropy Method Model Predictive Control Using Deterministic Samples

Markus Walker, Daniel Frisch, Uwe D. Hanebeck

TL;DR

The paper proposes dsCEM, a framework that replaces the random sampling step of CEM-MPC with deterministic samples derived from localized cumulative distributions (LCDs). It introduces modular schemes to generate and adapt these sample sets, and it incorporates temporal correlations to ensure smooth control trajectories.

Abstract

Cross-entropy method model predictive control (CEM-MPC) is a powerful gradient-free technique for nonlinear optimal control, but its performance is often limited by the reliance on random sampling. This conventional approach can lead to inefficient exploration of the solution space and non-smooth control inputs, requiring a large number of samples to achieve satisfactory results. To address these limitations, we propose deterministic sampling CEM (dsCEM), a novel framework that replaces the random sampling step with deterministic samples derived from localized cumulative distributions (LCDs). Our approach introduces modular schemes to generate and adapt these sample sets, incorporating temporal correlations to ensure smooth control trajectories. This method can be used as a drop-in replacement for the sampling step in existing CEM-based controllers. Experimental evaluations on two nonlinear control tasks demonstrate that dsCEM consistently outperforms state-of-the-art iCEM in terms of cumulative cost and control input smoothness, particularly in the critical low-sample regime.
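To make the "drop-in replacement for the sampling step" idea concrete, the following is a minimal sketch of a CEM planning loop in which the sampler is a pluggable function. It is not the authors' implementation. The deterministic sampler here is a stand-in that maps a Halton sequence through the inverse normal CDF, not the LCD-based sample sets the paper proposes, and temporal correlations are not modeled. All names (`cem_plan`, `deterministic_std_normal`, `make_random_sampler`, `toy_cost`) and the toy quadratic cost are hypothetical and chosen for illustration only.

```python
import random
from statistics import NormalDist, fmean, pstdev

PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)  # one Halton base per horizon step


def van_der_corput(i, base):
    """Return the i-th element (i >= 1) of the van der Corput sequence."""
    q, denom = 0.0, 1.0
    while i > 0:
        i, rem = divmod(i, base)
        denom *= base
        q += rem / denom
    return q


def deterministic_std_normal(n, dim):
    """Stand-in deterministic sampler (NOT the paper's LCD-based samples):
    Halton points pushed through the inverse standard-normal CDF."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most 10 dimensions")
    inv = NormalDist().inv_cdf
    return [[inv(van_der_corput(i, PRIMES[d])) for d in range(dim)]
            for i in range(1, n + 1)]


def make_random_sampler(seed):
    """Conventional random sampler with the same (n, dim) interface."""
    rng = random.Random(seed)
    return lambda n, dim: [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                           for _ in range(n)]


def cem_plan(cost, horizon, n_samples, n_elites, n_iters, sampler):
    """Plain CEM over a control sequence; only `sampler` differs between
    the random and the deterministic variant."""
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(n_iters):
        # Shift and scale standard-normal samples to the current Gaussian.
        z = sampler(n_samples, horizon)
        candidates = [[m + s * zi for m, s, zi in zip(mean, std, row)]
                      for row in z]
        # Keep the lowest-cost candidates and refit mean and std to them.
        elites = sorted(candidates, key=cost)[:n_elites]
        mean = [fmean(col) for col in zip(*elites)]
        std = [max(pstdev(col), 1e-6) for col in zip(*elites)]
    return mean


def toy_cost(u):
    """Hypothetical stand-in for a rollout cost: distance to u = 0.5."""
    return sum((x - 0.5) ** 2 for x in u)


plan_det = cem_plan(toy_cost, 3, 20, 5, 5, deterministic_std_normal)
plan_rnd = cem_plan(toy_cost, 3, 20, 5, 5, make_random_sampler(0))
```

Because both samplers share the signature `sampler(n, dim)`, switching between them requires no change to the CEM loop itself, which is the sense in which a deterministic sampling scheme can replace the random one. In a real controller, `toy_cost` would be replaced by a model rollout over the prediction horizon.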

Paper Structure

This paper contains 18 sections, 20 equations, 4 figures, 1 table, 2 algorithms.

Figures (4)

  • Figure 1: Schematic showing control input sampling over a finite horizon using either deterministic or random samples. As can be seen, deterministic samples cover the stochastic process without large gaps or clusters. For simplicity, time correlations are neglected.
  • Figure 2: Example of 25 two-dimensional deterministic samples, where the background color indicates the PDF.
  • Figure 3: Results for the mountain car task: cumulative cost and control input smoothness over sample size, as well as convergence behavior for a fixed sample size of 50. All plots show the median (line) and the interquartile range (shaded area) across 100 runs. The different methods' colors are consistent across all plots.
  • Figure 4: Results for the cart-pole task. The different methods' colors are consistent across all plots.