PCA-DDReach: Efficient Statistical Reachability Analysis of Stochastic Dynamical Systems via Principal Component Analysis

Navid Hashemi, Lars Lindemann, Jyotirmoy Deshmukh

TL;DR

This work addresses scalable, data-driven reachability analysis for stochastic dynamical systems by combining conformal inference with Principal Component Analysis (PCA). It introduces a per-segment surrogate training strategy to improve scalability and a PCA-based treatment of the residuals to reduce conservatism when computing the inflated reachability bounds, yielding a $\delta$-confident flowpipe even under distribution shifts. The approach is validated on a 12‑D quadcopter and a 27‑D hybrid powertrain, demonstrating tighter inflations and accurate probabilistic reachability compared with prior methods. The results matter for safety-critical CPS, where accurate probabilistic guarantees are essential and data-driven models are necessary because of model uncertainty or black-box dynamics.

Abstract

This study presents a scalable data-driven algorithm designed to efficiently address the challenging problem of reachability analysis. Analysis of cyber-physical systems (CPS) typically relies on parametric physical models of dynamical systems. However, identifying parametric physical models for complex CPS is challenging due to their complexity, uncertainty, and variability, which often renders them black-box oracles. As an alternative, one can treat these complex systems as black-box models and use trajectory data sampled from the system (e.g., from high-fidelity simulators or the real system) along with machine learning techniques to learn models that approximate the underlying dynamics. However, these machine learning models can be inaccurate, highlighting the need for statistical tools to quantify errors. Recent advancements in the field include the incorporation of statistical uncertainty quantification tools such as conformal inference (CI) that can provide probabilistic reachable sets with provable guarantees. Recent work has even highlighted the ability of these tools to address the case where the distribution of trajectories sampled at training time differs from the distribution of trajectories encountered at deployment time. However, accounting for such distribution shifts typically results in more conservative guarantees. This is undesirable in practice and motivates us to present techniques that can reduce conservatism. Here, we propose a new approach that reduces conservatism and improves scalability by combining conformal inference with Principal Component Analysis (PCA). We show the effectiveness of our technique on various case studies, including a 12-dimensional quadcopter and a 27-dimensional hybrid system known as the powertrain.
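The abstract relies on conformal inference to turn prediction errors into a bound that holds with a chosen confidence. As a rough illustration only, the following sketch shows the standard split-conformal quantile rule, not the paper's algorithm. The function name `conformal_quantile` and the example scores are assumptions made for this sketch, and the distribution-shift case discussed in the abstract is not handled.

```python
import math
import numpy as np

def conformal_quantile(scores, delta):
    """Threshold q such that a fresh exchangeable score is <= q with
    probability at least delta (standard split-conformal rule).

    `scores` are nonconformity scores from a held-out calibration set,
    for example norms of surrogate-model prediction errors (assumption).
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    # Rank of the calibration score used as the threshold.
    rank = math.ceil((n + 1) * delta)
    if rank > n:
        # Too few calibration samples to certify this confidence level.
        return float("inf")
    return float(scores[rank - 1])

# Hypothetical calibration scores, for illustration only.
example_scores = [0.3, 0.1, 0.4, 0.15, 0.9, 0.2, 0.6, 0.5, 0.7]
threshold = conformal_quantile(example_scores, delta=0.5)
```

With nine calibration scores and $\delta = 0.5$, the rule picks the fifth smallest score. Asking for $\delta = 0.99$ with only nine samples returns infinity, which reflects that high confidence levels need correspondingly large calibration sets.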

Paper Structure

This paper contains 18 sections, 2 theorems, 18 equations, 6 figures, 1 table.

Key Result

Lemma 3

Let $\bar{X}$ be a surrogate flowpipe of the surrogate model $\mathcal{F}$ for the set of initial conditions $\mathcal{I}$. Let $\mathsf{PE}:=\left[ R^{1} , R^{2}, \ldots , R^{n\mathrm{K}} \right]$ be the sequence of prediction errors for $\sigma^{\mathsf{real}}_{s_0} \sim \mathcal{D}_{S,\mathrm{K}}$ ...

Figures (6)

  • Figure 1: This figure shows the division of the trajectory into $N$ different segments $\sigma^{\mathsf{sim} , q}_{s_0}, q\in[N]$
  • Figure 2: Projection of prediction errors for two-dimensional states over a horizon of $\mathrm{K} = 2$. The left panel shows the projection on the $(R^1, R^2)$ axes (i.e., $k=1$), and the right panel shows the projection on the $(R^3, R^4)$ axes (i.e., $k=2$). The figure compares the inflating hypercubes for a confidence level $\delta \in (0,1)$ generated by the PCA approach (red hypercubes) with those of the method proposed in hashemi2024statistical (green hypercubes). The PCA hypercubes are clearly tighter than those of the other method. The principal axes for $k=1,2$ are $(r^1, r^2)$ and $(r^3, r^4)$, respectively.
  • Figure 3: Comparison with hashemi2024statistical. The blue and red borders are projections of our and their $\delta$-confident flowpipes, respectively, with $\delta = 99.99\%$. The shaded regions show the density of the trajectories from $\mathcal{T}^{\mathsf{trn}}$.
  • Figure 4: Projection of our $\delta$-confident flowpipe on each component of the trajectory state. The shaded areas are simulated trajectories from $\mathcal{T}^{\mathsf{trn}}$.
  • Figure 5: Projection of our $\delta$-confident flowpipe on the first $8$ components of the trajectory state. There is a shift between the distributions of the deployment and training environments. The shaded areas are trajectories sampled from the deployment environment.
  • ...and 1 more figure
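The comparison described for Figure 2, axis-aligned versus PCA-aligned inflating hypercubes, can be reproduced in spirit on synthetic data. The sketch below is an assumption-laden illustration rather than the authors' procedure. The score (largest normalized absolute coordinate), the helper names `box_half_widths` and `compare_boxes`, and the correlated Gaussian residuals are all invented for this example.

```python
import math
import numpy as np

def box_half_widths(calib, axes, scales, delta):
    """Half-widths of a delta-confident box aligned with the rows of `axes`."""
    coords = calib @ axes.T  # calibration residuals in the chosen frame
    # Score: largest normalized absolute coordinate (an assumed choice).
    scores = np.sort(np.max(np.abs(coords) / scales, axis=1))
    rank = math.ceil((len(scores) + 1) * delta)  # split-conformal rank
    if rank > len(scores):
        raise ValueError("too few calibration samples for this delta")
    return scores[rank - 1] * scales

def compare_boxes(train, calib, delta):
    """Return (axis-aligned volume, PCA-aligned volume) of the two boxes."""
    mean = train.mean(axis=0)
    centered = train - mean
    # Axis-aligned frame: identity axes, per-coordinate standard deviations.
    eye = np.eye(train.shape[1])
    axis_hw = box_half_widths(calib - mean, eye,
                              centered.std(axis=0, ddof=1), delta)
    # PCA frame: right singular vectors are the principal axes.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    pca_hw = box_half_widths(calib - mean, vt,
                             sing / math.sqrt(len(train) - 1), delta)
    return float(np.prod(2 * axis_hw)), float(np.prod(2 * pca_hw))

rng = np.random.default_rng(0)
# Synthetic, strongly correlated 2-D residuals (illustrative data only).
mix = np.array([[1.0, 0.95], [0.95, 1.0]])
train = rng.standard_normal((2000, 2)) @ mix
calib = rng.standard_normal((2000, 2)) @ mix
vol_axis, vol_pca = compare_boxes(train, calib, delta=0.95)
print(f"axis-aligned volume: {vol_axis:.2f}, PCA-aligned volume: {vol_pca:.2f}")
```

The principal axes are fitted on one data split and the threshold is calibrated on another, so the calibration scores stay exchangeable with a fresh sample. When residual components are strongly correlated, the PCA-aligned box follows the elongated error cloud and encloses the same confidence level with a much smaller volume.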

Theorems & Definitions (7)

  • Definition 1: Simulation & Real Residual Distribution
  • Definition 2: Star set bak2017simulation
  • Lemma 3
  • Definition 4: Calibration Dataset
  • Proposition 5
  • proof
  • Remark 6