Data-Driven Distributionally Robust Optimal Control with State-Dependent Noise

Rui Liu, Guangyao Shi, Pratap Tokekar

TL;DR

Data-Driven Distributionally Robust Optimal Control ($\mathrm{D}^3\mathrm{ROC}$) addresses the problem of unknown disturbance distributions by jointly learning the reference distribution $q$ and the KL divergence bound $d$ from data. It combines Gaussian Process-based noise modeling for state-dependent references with a $k$-nearest-neighbor ($k$-NN) KL-divergence estimator, integrated into a Differential Dynamic Programming (DDP) inner loop and a cross-entropy outer loop. The approach is validated on a car-like robot navigation task, where $\mathrm{D}^3\mathrm{ROC}$ demonstrates robust performance and risk-averse behavior, outperforming iLQG across multiple noise scenarios. Because it needs no pre-specified ambiguity set, the method is aimed at real-world systems whose disturbances are uncertain or non-stationary.
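
To make the $k$-NN KL-divergence step concrete, the sketch below estimates $\mathrm{KL}(p \,\|\, q)$ from two sample sets using a standard nearest-neighbor estimator of the Wang, Kulkarni and Verdú type. This summary does not state which variant the paper uses, so the estimator form, the function name `knn_kl_divergence`, the choice $k=5$, and the Gaussian test data are all assumptions made for illustration.

```python
import math
import random


def knn_kl_divergence(xs, ys, k=5):
    """Estimate KL(P || Q) from samples xs ~ P and ys ~ Q.

    xs, ys: lists of equal-length tuples of floats.
    Assumed estimator (not confirmed as the paper's):
        (dim / n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1))
    where rho_k(i) is the distance from xs[i] to its k-th nearest
    neighbor among the other xs, and nu_k(i) is the distance from
    xs[i] to its k-th nearest neighbor among ys.
    """
    n, m = len(xs), len(ys)
    dim = len(xs[0])
    total = 0.0
    for i, x in enumerate(xs):
        # Brute-force neighbor search; fine for a few hundred samples.
        rho = sorted(math.dist(x, xj) for j, xj in enumerate(xs) if j != i)[k - 1]
        nu = sorted(math.dist(x, y) for y in ys)[k - 1]
        total += math.log(nu / rho)
    return dim * total / n + math.log(m / (n - 1))


# Illustrative data: P = N(0, 1), compared with Q = N(0, 1) and Q = N(3, 1).
random.seed(0)
p_samples = [(random.gauss(0.0, 1.0),) for _ in range(300)]
q_same = [(random.gauss(0.0, 1.0),) for _ in range(300)]
q_shifted = [(random.gauss(3.0, 1.0),) for _ in range(300)]

kl_same = knn_kl_divergence(p_samples, q_same)
kl_shift = knn_kl_divergence(p_samples, q_shifted)
print("KL estimate, identical distributions:", kl_same)
print("KL estimate, mean shifted by 3:", kl_shift)
```

In a pipeline like the one described, an estimate of this kind could serve as the data-driven bound $d$ between observed disturbance samples and samples from the reference $q$. The brute-force search is quadratic in the sample count; a k-d tree would be the usual replacement for larger data sets.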

Abstract

Distributionally Robust Optimal Control (DROC) is a framework that enables robust control in a stochastic setting where the true disturbance distribution is unknown. Traditional DROC approaches require given ambiguity sets and KL divergence bounds to represent the distributional uncertainty; however, these quantities are often unavailable a priori or require manual specification. To overcome this limitation, we propose a data-driven approach that jointly estimates the uncertainty distribution and the corresponding KL divergence bound, which we refer to as $\mathrm{D}^3\mathrm{ROC}$. To evaluate the effectiveness of our approach, we consider a car-like robot navigation task with unknown noise distributions. The experimental results show that $\mathrm{D}^3\mathrm{ROC}$ yields robust and effective control policies, outperforming iterative Linear Quadratic Gaussian (iLQG) control and demonstrating strong adaptability to varying noise distributions.

Paper Structure

This paper contains 19 sections, 20 equations, 3 figures, 1 table, 2 algorithms.

Figures (3)

  • Figure 1: The heatmap of variances of each component of $p^{(b)}$ for the $x$ and $y$ coordinates.
  • Figure 2: Visualization of the GP used to predict variances for $p^{(b)}$, based on $100$ data points, with black dots representing the variances calculated using MLE from observed data, the line depicting the fitted mean, and the shaded area indicating the $95\%$ confidence interval.
  • Figure 3: Comparison of a car-like robot navigating to the origin under unknown true noise $p^{(b)}$ with $\mathrm{D}^3\mathrm{ROC}$ and iLQG. The starting position is $(5.0, 5.0)$ and the goal position is $(0.0,0.0)$. The results of multiple runs are shown in the figure, where the red lines and blue lines represent different paths of the robot under iLQG and $\mathrm{D}^3\mathrm{ROC}$, respectively.
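
The procedure behind Figure 2 (per-state variances obtained by MLE, then a GP fitted to them with a $95\%$ interval) can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the squared-exponential kernel, its hyperparameters, the linear ground-truth variance `true_variance`, and all function names are hypothetical. Only the use of $100$ data points follows the caption.

```python
import math
import random


def rbf(a, b, length, signal_var):
    """Squared-exponential kernel for scalar inputs (assumed kernel choice)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def mle_variance(samples):
    """Gaussian maximum-likelihood variance (divides by n, not n - 1)."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def gp_fit(xs, ys, length=2.0, signal_var=0.01, noise_var=1e-4):
    """Fit a GP with a constant mean equal to the average of ys."""
    offset = sum(ys) / len(ys)
    K = [[rbf(a, b, length, signal_var) + (noise_var if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = backward_sub(L, forward_sub(L, [y - offset for y in ys]))
    return {"xs": xs, "L": L, "alpha": alpha, "offset": offset,
            "length": length, "signal_var": signal_var}


def gp_predict(model, x_star):
    """Return (mean, lower, upper) of the 95% interval for the latent function."""
    k_star = [rbf(x, x_star, model["length"], model["signal_var"])
              for x in model["xs"]]
    mean = model["offset"] + sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = forward_sub(model["L"], k_star)
    var = model["signal_var"] - sum(vi * vi for vi in v)
    half_width = 1.96 * math.sqrt(max(var, 0.0))
    return mean, mean - half_width, mean + half_width


def true_variance(x):
    """Hypothetical state-dependent noise variance used to generate data."""
    return 0.05 + 0.02 * x


random.seed(0)
xs = [5.0 * i / 99 for i in range(100)]  # 100 states, as in the caption
ys = [mle_variance([random.gauss(0.0, math.sqrt(true_variance(x)))
                    for _ in range(200)]) for x in xs]

model = gp_fit(xs, ys)
mean_mid, lo_mid, hi_mid = gp_predict(model, 2.5)   # inside the data range
mean_far, lo_far, hi_far = gp_predict(model, 50.0)  # far from any data
print("x = 2.5:", mean_mid, (lo_mid, hi_mid))
print("x = 50.0:", mean_far, (lo_far, hi_far))
```

The interval is narrow where variance observations are dense and widens to the prior far from the data, which is the behavior the shaded band in Figure 2 depicts. A plain GP on raw variances can predict negative values; fitting log-variances instead is a common workaround, though the summary does not say whether the paper does this.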