Dirichlet Logistic Gaussian Processes for Evaluation of Black-Box Stochastic Systems under Complex Requirements

Ryohei Oura, Yuji Ito

TL;DR

This work addresses distributional evaluation of robustness for black-box CPS under complex requirements when data are scarce. It introduces the Dirichlet Logistic Gaussian Process (DLGP), a semiparametric model that places a Dirichlet random field over discretized robustness levels. The Dirichlet parameter function alpha_post(x) is the sum of a prior and a data-driven pseudo-count, and the pseudo-counts are produced by multiple Logistic Gaussian Processes (LGPs) that model level-specific input densities. A conservativeness parameter lambda controls how much of the LGPs' uncertainty enters the pseudo-counts, yielding conservative, data-aware estimates with quantified confidence. The model is shown to be consistent as the data grow, and empirical results on a robot path-planning example indicate that DLGP outperforms KDE- and GDP-based methods in accuracy and in calibration of uncertainty in small-data regimes.
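As a rough illustration of the "prior plus pseudo-count" structure of alpha_post(x) at a single input x, the sketch below adds a pseudo-count vector to a prior parameter vector and evaluates the standard Dirichlet mean and covariance. The function name `dirichlet_posterior` and the numbers are hypothetical; the pseudo-counts are simply taken as given here, and the way the paper derives them from the LGP posteriors and lambda is not reproduced.

```python
import numpy as np

def dirichlet_posterior(prior_alpha, pseudo_counts):
    """Combine a prior parameter vector with pseudo-counts at one input x.

    Both arguments are length-m vectors (one entry per robustness level).
    Returns the Dirichlet parameter, its mean, and its covariance matrix.
    """
    alpha = np.asarray(prior_alpha, dtype=float) + np.asarray(pseudo_counts, dtype=float)
    alpha0 = alpha.sum()
    mean = alpha / alpha0
    # Standard Dirichlet covariance: (diag(mean) - mean mean^T) / (alpha0 + 1)
    cov = (np.diag(mean) - np.outer(mean, mean)) / (alpha0 + 1.0)
    return alpha, mean, cov

# Hypothetical values: uniform prior over m = 3 levels, pseudo-counts favoring level 2.
alpha, mean, cov = dirichlet_posterior([1.0, 1.0, 1.0], [0.5, 3.0, 0.5])
```

Larger pseudo-counts increase the total alpha0 and therefore shrink the covariance, which is how more data near x translates into higher confidence in this kind of model.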

Abstract

The requirement-driven performance evaluation of a black-box cyber-physical system (CPS) that utilizes machine learning methods has proven to be an effective way to assess the quality of the CPS. However, the distributional evaluation of the performance has been poorly considered. Although many uncertainty estimation methods have been advocated, they have not successfully estimated highly complex performance distributions under small data. In this paper, we propose a method to distributionally evaluate the performance under complex requirements using small input-trajectory data. To handle the unknown complex probability distributions under small data, we discretize the corresponding performance measure, yielding a discrete random process over an input region. Then, we propose a semiparametric Bayesian model of the discrete process based on a Dirichlet random field whose parameter function is represented by multiple logistic Gaussian processes (LGPs). The Dirichlet posterior parameter function is estimated through the LGP posteriors in a reasonable and conservative fashion. We show that the proposed Bayesian model converges to the true discrete random process as the number of data becomes large enough. We also empirically demonstrate the effectiveness of the proposed method by simulation.
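The discretization step described in the abstract can be pictured with a small sketch: continuous performance (robustness) values are mapped to one of m disjoint intervals, giving a discrete level per sample. The cut points `edges`, the sample values, and the helper name `discretize` are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

def discretize(rho_values, edges):
    """Map each robustness value to the index of the interval containing it.

    `edges` holds sorted interior cut points, so there are len(edges) + 1
    disjoint levels: (-inf, e0), [e0, e1), ..., [e_last, +inf).
    """
    return np.digitize(rho_values, edges)

# Hypothetical robustness values and cut points (m = 4 levels).
rho = np.array([-1.2, -0.1, 0.3, 2.5])
edges = np.array([-0.5, 0.0, 1.0])

levels = discretize(rho, edges)
counts = np.bincount(levels, minlength=len(edges) + 1)
```

Each sample then carries a discrete label, so the unknown continuous distribution at an input is replaced by a categorical distribution over the m levels.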

Paper Structure

This paper contains 11 sections, 2 theorems, 23 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Consider $m$ LGP priors with the covariance kernels $\{ g_l \}_{l=1}^m$, an input-trajectory data set $X=\{x_i\}_{i=1}^N$ and $Y=\{ y_i \}_{i=1}^N$, a robustness degree $\rho$, and disjoint intervals $L_1, \ldots, L_m$, where each $x \in X$ is drawn i.i.d. from a density $p(x)$ and $\bm{\pi}(x)$ denotes the [...] where $\hat{\bm{\pi}}_E(x)$ and $\hat{\bm{\pi}}_V(x)$ are the mean and covariance matrix of $Dir(\ldots)$ [...]

Figures (2)

  • Figure 1: (a) An illustrative example of an environment with two obstacles (black rectangles) and the $500$ resulting trajectories. The input region $\mathcal{X} = [0,10]^2$ is the set of initial locations of the robot. The goal location of the moving robot is $(35,5)$. (b) The $500$ inputs sampled in $\mathcal{X}$ through (\ref{sample_dist}).
  • Figure 2: (a) Mean of the index $Ind(c)$ in (\ref{index}) over $20$ estimations. Each shaded region represents the standard deviation. We omit plots for which the corresponding $\mathcal{X}(c)$ is empty. (b) Mean of the index $CredRatio(c)$ in (\ref{cred_ratio}) over $20$ estimations. "No sample" at $c=1$ denotes the value of $CredRatio(c)$ over the area where no sample is obtained.

Theorems & Definitions (6)

  • Remark 1
  • Remark 2: Summary
  • Theorem 1
  • Lemma 1
  • Proof
  • Proof of Theorem 1