Constraint-Guided Online Data Selection for Scalable Data-Driven Safety Filters in Uncertain Robotic Systems

Jason J. Choi; Fernando Castañeda; Wonsuhk Jung; Bike Zhang; Claire J. Tomlin; Koushil Sreenath

TL;DR

An efficient online data selection algorithm that strategically picks key data points makes data-driven certifying filters more practical and efficient for complex robotic systems, significantly mitigating the scalability concerns inherent in nonparametric learning-based control methods.

Abstract

As the use of autonomous robots expands in tasks that are complex and challenging to model, the demand for robust data-driven control methods that can certify safety and stability in uncertain conditions is increasing. However, the practical implementation of these methods often faces scalability issues due to the growing amount of data points with system complexity, and a significant reliance on high-quality training data. In response to these challenges, this study presents a scalable data-driven controller that efficiently identifies and infers from the most informative data points for implementing data-driven safety filters. Our approach is grounded in the integration of a model-based certificate function-based method and Gaussian Process (GP) regression, reinforced by a novel online data selection algorithm that reduces time complexity from quadratic to linear relative to dataset size. Empirical evidence, gathered from successful real-world cart-pole swing-up experiments and simulated locomotion of a five-link bipedal robot, demonstrates the efficacy of our approach. Our findings reveal that our efficient online data selection algorithm, which strategically selects key data points, enhances the practicality and efficiency of data-driven certifying filters in complex robotic systems, significantly mitigating scalability concerns inherent in nonparametric learning-based control methods.
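To make the scalability point concrete, here is a minimal sketch of exact GP regression evaluated once on a full dataset and once on a small subset chosen at query time. It is an illustration under stated assumptions, not the paper's method: the RBF kernel, its hyperparameters, the toy target function, and the helper names (`rbf_kernel`, `gp_posterior`, `select_nearest`) are all hypothetical, and the nearest-neighbor rule is only a stand-in for the paper's constraint-guided selection algorithm.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.5, signal_var=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_posterior(X, y, x_query, noise_var=1e-4):
    # Exact GP posterior mean and variance at a single query point.
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    k_star = rbf_kernel(X, x_query[None, :])[:, 0]
    L = np.linalg.cholesky(K)  # cost grows with the number of data points used
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, k_star)
    mean = k_star @ alpha
    var = rbf_kernel(x_query[None, :], x_query[None, :])[0, 0] - v @ v
    return mean, max(var, 0.0)

def select_nearest(X, x_query, M):
    # Stand-in selection rule (assumption): indices of the M points closest to the query.
    dist = np.linalg.norm(X - x_query, axis=1)
    return np.argsort(dist)[:M]

# Toy data (assumption): 200 samples of a smooth 2D function.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
x_query = np.array([0.1, -0.2])

idx = select_nearest(X, x_query, M=20)
mean_full, var_full = gp_posterior(X, y, x_query)
mean_sub, var_sub = gp_posterior(X[idx], y[idx], x_query)
print(f"full: mean={mean_full:.3f}, var={var_full:.2e}")
print(f"sub : mean={mean_sub:.3f}, var={var_sub:.2e}")
```

With fixed hyperparameters, the subset posterior variance can never be smaller than the full-data variance, so which points are kept determines how much uncertainty the downstream filter has to tolerate.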

Paper Structure (31 sections, 7 theorems, 74 equations, 7 figures, 1 table, 1 algorithm)

Introduction
Motivation
Contributions
Notations
Related Work
Certifying Filters for Uncertain Systems
Uncertain Dynamics and Certifying Filter
Certificate Function-based Design
Data-driven Certifying Filters
Gaussian Process Regression
Second-order Cone Program-based Certifying Filters
Running Example: 2D Polynomial System (Polysys)
Constraint Guided Online Data Selection
Preliminaries
Simplified notations for kernels
...and 16 more sections

Key Result

Lemma 1

Given a dataset $\mathbb{D}_N$, for a point $x \in \mathcal{X}$, if there exists a constant $\alpha > 0$ such that the following inequality holds, then the GP-CF-SOCP in \ref{eq:gp-cbf-socp} is feasible. The feasible control input can be found by taking $u = \alpha' \widehat{L_g C}(x|\mathbb{D}_N)^\top$ with sufficiently large $\alpha' > 0$.
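The scaling argument in the lemma can be sketched numerically. The code below assumes, purely for illustration, that the chance constraint has the generic form "predicted mean minus $\beta$ times predicted standard deviation is nonnegative", with the standard deviation given by $\sqrt{[1\; u^\top]\,\Sigma\,[1\; u^\top]^\top}$. The lemma's actual inequality is not reproduced here, and the names `m0`, `d`, `Sigma`, `beta`, `constraint_value`, and `find_feasible_scale` are hypothetical. The sketch searches for a scale $\alpha'$ such that $u = \alpha' d^\top$ satisfies the assumed constraint, where `d` plays the role of $\widehat{L_g C}(x|\mathbb{D}_N)$.

```python
import numpy as np

def constraint_value(u, m0, d, Sigma, beta):
    # Assumed constraint form: mean(u) - beta * std(u) >= 0,
    # with mean(u) = m0 + d @ u and std(u)^2 = [1, u] Sigma [1, u]^T.
    z = np.concatenate(([1.0], u))
    return m0 + d @ u - beta * np.sqrt(z @ Sigma @ z)

def find_feasible_scale(m0, d, Sigma, beta, alpha_max=1e6):
    # Try u = alpha * d for alpha = 1, 2, 4, ... and return the first feasible scale.
    alpha = 1.0
    while alpha <= alpha_max:
        if constraint_value(alpha * d, m0, d, Sigma, beta) >= 0.0:
            return alpha
        alpha *= 2.0
    return None  # no feasible scale found along this direction

# Hypothetical numbers for a 2-input system.
m0 = -1.0                    # mean term that does not depend on u
d = np.array([0.8, -0.3])    # stand-in for the estimated L_g C row vector
beta = 2.0

# Low uncertainty in the control directions: the mean outgrows beta * std.
Sigma_low = np.diag([0.05, 0.04, 0.04])
alpha_low = find_feasible_scale(m0, d, Sigma_low, beta)

# High uncertainty in the control directions: beta * std grows faster than the mean.
Sigma_high = np.diag([0.05, 1.0, 1.0])
alpha_high = find_feasible_scale(m0, d, Sigma_high, beta)

print("low-uncertainty scale :", alpha_low)
print("high-uncertainty scale:", alpha_high)
```

Under this assumed form, both the mean and the standard deviation grow linearly in $\alpha'$ for large $\alpha'$, so feasibility along the direction `d` comes down to which one grows faster. This is the same comparison that the caption of Figure 2 describes as the ratio between the line's length and the ellipse's radial distance.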

Figures (7)

Figure 1: The simulation result of the Polysys example under various controllers: the nominal model-based CLF-QP (magenta), the oracle CLF-QP (blue), and the GP-CLF-SOCP using full data (black). The topmost plot illustrates the trajectory's progression in the state space for 2.5 seconds, with an initial state of $x_0=[-0.4\; 0.6]^\top$, while the data is depicted as grey dots. The three subplots on the bottom show the states $x_1$, $x_2$, and the CLF values, respectively. While the trajectory quickly diverges under the nominal model-based CLF-QP, the GP-CLF-SOCP successfully stabilizes the trajectory to the origin.
Figure 2: Comparison between the two data selection strategies: (a) the naive approach described in Section \ref{subsec:naive} and (b) our main algorithm described in Section \ref{subsec:coca}, on the Polysys running example system, with a varying number of online selected data points ($M=5, 10, 20$). In each case, the first row visualizes the entire dataset $\mathbb{D}_N$ (grey dots) projected on the state space and the data points selected online $\mathbb{D}_M$ (magenta dots) according to the data selection algorithm at the query state $x = [-0.02 \; 1.10]^\top$ (orange diamond). The second row visualizes the selected points projected on the control input space (magenta dots), and the growth of the prediction uncertainty $\beta\sigma(x, u)$ in the control space as an ellipse. We also visualize $\widehat{L_g C}(x|\mathbb{D}_N)$ and $\widehat{L_g C}(x|\mathbb{D}_M)$ as the dashed green and magenta lines, respectively. The ellipse and magenta line represent the growth of the right-hand side and left-hand side of \ref{eq:socp-cbf-constraint2}, respectively. The feasibility of the chance certifying constraint can be deduced by evaluating the relative ratio of the magenta line's length to the ellipse's radial distance in the magenta line's direction. A smaller ratio suggests that a larger control input in the $\widehat{L_g C}(x|\mathbb{D}_M)$ direction is required to satisfy the chance constraint.
Figure 3: The simulation result of the Polysys example under the GP-CLF-SOCP controllers using: naive data selection (orange) discussed in Sec. \ref{subsec:naive}, FITC \cite{snelson2005sparse} whose inducing points are evenly spaced and fixed (grey), the dataset selected by Algorithm \ref{algo:sparsegp} (green) discussed in Sec. \ref{subsec:coca}, and FITC whose inducing points are selected by Algorithm \ref{algo:sparsegp} (purple). All controllers use the same number of online data points or inducing points, $M=20$. While the naive approaches often face infeasibility and fail to stabilize the system close to the origin, our approaches effectively select an online dataset that secures the feasibility of the SOCP.
Figure 4: (a) The configuration of the planar five-link bipedal robot RABBIT \cite{chevallereau2003rabbit}. (b) Cart-pole experiment setup based on the Quanser Linear Servo Base Unit with Inverted Pendulum \cite{quanser_products_2021}.
Figure 5: Simulation results of RABBIT achieving stable walking under various controllers: the nominal model-based CLF-QP (magenta), the oracle CLF-QP (blue), GP-CLF-SOCP (Full) (black), and GP-CLF-SOCP (Ours) (green). The left column depicts histories of the Euclidean norm of the tracking error $y$ and its time derivative $\dot{y}$ with respect to the reference gait. The right column shows the evolution of the hip's vertical position from the ground and the value of the CLF $V(x)$.
...and 2 more figures

Theorems & Definitions (21)

Definition 1
Remark 1
Definition 2
Definition 3
Remark 2
Lemma 1
proof
Lemma 2
proof
Remark 3
...and 11 more
