Safe Non-Stochastic Control of Control-Affine Systems: An Online Convex Optimization Approach

Hongyu Zhou, Yichen Song, Vasileios Tzoumas

TL;DR

Safe-NSC tackles the problem of safely controlling nonlinear, control-affine systems under bounded non-stochastic disturbances with time-varying safety constraints. It introduces Safe-OGD, an online gradient-descent based algorithm that uses discrete-time control barrier functions to preserve safety while minimizing a convex loss, achieving bounded dynamic regret against a clairvoyant oracle. The framework extends to linear policy parameterizations and provides regret bounds that recover classical OCO results in time-invariant settings and converge to optimal linear controllers in static scenarios. Evaluation on an inverted pendulum and a cluttered quadrotor demonstrates safety guarantees and competitive tracking performance against strong baselines. This work advances real-time, provably safe autonomy under non-stochastic disturbances for nonlinear, control-affine systems, with practical implications for robust autonomous robotics.

Abstract

We study how to safely control nonlinear control-affine systems that are corrupted with bounded non-stochastic noise, i.e., noise that is unknown a priori and that is not necessarily governed by a stochastic model. We focus on safety constraints that take the form of time-varying convex constraints such as collision-avoidance and control-effort constraints. We provide an algorithm with bounded dynamic regret, i.e., bounded suboptimality against an optimal clairvoyant controller that knows the realization of the noise a priori. We are motivated by the future of autonomy where robots will autonomously perform complex tasks despite real-world unpredictable disturbances such as wind gusts. To develop the algorithm, we capture our problem as a sequential game between a controller and an adversary, where the controller plays first, choosing the control input, whereas the adversary plays second, choosing the noise's realization. The controller aims to minimize its cumulative tracking error despite being unable to know the noise's realization a priori. We validate our algorithm in simulated scenarios of (i) an inverted pendulum aiming to stay upright, and (ii) a quadrotor aiming to fly to a goal location through an unknown cluttered environment.
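
The sequential game described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's Safe-OGD algorithm: the scalar dynamics `f` and `g`, the quadratic tracking loss, the fixed interval used as the safe control set, and every parameter name and value are hypothetical choices made for the example.

```python
import math

def f(x):
    # Hypothetical drift term of a scalar control-affine system.
    return 0.9 * x + 0.1 * math.sin(x)

def g(x):
    # Hypothetical input gain; the dynamics are affine in u for fixed x.
    return 1.0 + 0.1 * math.cos(x)

def project_interval(u, lo, hi):
    # Euclidean projection of a scalar onto the interval [lo, hi].
    return min(max(u, lo), hi)

def run_online_control(noise, x0=0.0, target=1.0, eta=0.2, rho=0.1, u_max=0.5):
    """Projected online gradient descent on the control input.

    At each step the controller commits to u, then the adversary's noise w
    is revealed through the next state, and only then is the loss known.
    """
    x, u = x0, 0.0
    xs, us, losses = [x0], [], []
    for w in noise:
        # Controller plays first: u was fixed before w is seen.
        x_next = f(x) + g(x) * u + w
        # Assumed convex loss: tracking error plus control effort.
        loss = (x_next - target) ** 2 + rho * u ** 2
        # Gradient of the loss with respect to u (x_next is affine in u).
        grad = 2.0 * g(x) * (x_next - target) + 2.0 * rho * u
        xs.append(x_next)
        us.append(u)
        losses.append(loss)
        # Gradient step, then projection back onto the safe control set.
        u = project_interval(u - eta * grad, -u_max, u_max)
        x = x_next
    return xs, us, losses

if __name__ == "__main__":
    # A deterministic, non-stochastic disturbance sequence (assumed).
    noise = [0.2 * math.sin(0.7 * t) for t in range(50)]
    xs, us, losses = run_online_control(noise)
    print("largest |u| played:", max(abs(u) for u in us))
    print("cumulative loss:", sum(losses))
```

The projection step is what keeps every played input inside the constraint set; in the paper the set is time-varying and built from control barrier functions, whereas here it is a fixed interval purely for brevity.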

Paper Structure

This paper contains 25 sections, 3 theorems, 16 equations, 3 figures, 2 tables, and 1 algorithm.

Key Result

Lemma 1

The loss function $c_{t}\left(x_{t+1}, u_{t}\right) : \mathbb{R}^{d_x} \times \mathbb{R}^{d_u} \rightarrow \mathbb{R}$ is convex in the control input $u_t$, given functions $f\left(\cdot\right)$ and $g\left(\cdot\right)$, $x_t$, and $w_t$.
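
A short argument for why Lemma 1 is plausible, under two assumptions that this excerpt does not state explicitly: that the dynamics take the standard discrete-time control-affine form with additive noise, and that $c_t$ is jointly convex in its two arguments. With $x_t$ and $w_t$ fixed, the pair $\left(x_{t+1}, u_t\right)$ is an affine function of $u_t$:

$$
\begin{aligned}
x_{t+1} &= f\left(x_t\right) + g\left(x_t\right) u_t + w_t, \\
\begin{bmatrix} x_{t+1} \\ u_t \end{bmatrix}
&= \begin{bmatrix} f\left(x_t\right) + w_t \\ 0 \end{bmatrix}
 + \begin{bmatrix} g\left(x_t\right) \\ I \end{bmatrix} u_t, \\
\tilde{c}_t\left(u_t\right) &\triangleq c_t\left(f\left(x_t\right) + g\left(x_t\right) u_t + w_t,\; u_t\right).
\end{aligned}
$$

Since the composition of a convex function with an affine map is convex, $\tilde{c}_t$ is convex in $u_t$ under these assumptions. The symbol $\tilde{c}_t$ is introduced here only for illustration.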

Figures (3)

  • Figure 1: Safe non-stochastic control example: Autonomous flight in cluttered environments subject to unknown wind disturbances. In this paper, we focus on safe non-stochastic control of control-affine systems where the robots' capacity to select effective control actions fast is challenged by (i) time-varying safety constraints, (ii) unknown, unstructured, and, more broadly, unpredictable noise, and (iii) nonlinear control-affine dynamics. For example, in package delivery with quadrotors, the quadrotors are required to fly to goal positions. But during such tasks, (i) the quadrotors need to ensure collision avoidance at all times, which requires control actions that respect time-varying state and control-input constraints, (ii) the quadrotors may be disturbed by unpredictable wind gusts, and (iii) they need to account for their nonlinear, in particular, control-affine dynamics. These challenges stress the quadrotors' ability to decide effective control inputs fast, and to ensure safety. We aim to provide a control algorithm that handles these challenges, guaranteeing bounded suboptimality against optimal safe controllers in hindsight.
  • Figure 2: Autonomous system architecture in the quadrotor simulation.
  • Figure 3: Quadrotor simulation results with goal position $\left[10 \ 10 \ 1 \right]^\top$ and disturbances $\left[3 \ 3 \ 0 \right]^\top$. The black line is the trajectory, the blue line is the reference trajectory, the red zone is the area where the external forces are applied to the quadrotor, the shaded polytope is the safety constraint, and the gold star is the goal position. (a) lee2010geometric collides with obstacles and often has a poor safety rate; (b) wu2021external collides with obstacles due to the latency of R-NMPC; (d) R-NMPC in wu2021external has a longer flight time and trajectory length since it aims to guarantee safety against worst-case disturbances over a lookahead horizon; (c) & (e) Our method achieves collision avoidance while having better performance in flight time, trajectory length, and tracking error.

Theorems & Definitions (8)

  • Remark 1: Removal of the Stability Condition
  • Definition 1: Dynamic Regret
  • Definition 2: Discrete-Time Exponential Control Barrier Function (DCBF) agrawal2017discrete
  • Lemma 1: Convexity of Loss Function in Control Input
  • Lemma 2: Construction of Time-Varying Domain Set with Safety Guarantee
  • Theorem 1: Dynamic Policy Regret Bound of Algorithm 1
  • Remark 2: Optimality under Time-Invariant Domain of Optimization
  • Remark 3: Optimality under also Time-Invariant Control Policies
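
To illustrate how a barrier-function condition can yield a convex control set, as Definition 2 and Lemma 2 suggest, the sketch below turns a discrete-time exponential CBF condition $h\left(x_{t+1}\right) \geq \left(1-\gamma\right) h\left(x_t\right)$, $0 < \gamma \leq 1$, into a half-space in $u_t$ and projects a nominal input onto it. Everything specific is an assumption for illustration and not necessarily the paper's construction: the affine safety function $h(x) = b - a^\top x$, the worst-case noise margin based on a norm bound `w_bound`, and all function and parameter names.

```python
import math

def dot(a, b):
    # Inner product of two equal-length lists.
    return sum(ai * bi for ai, bi in zip(a, b))

def dcbf_halfspace(a, b, fx, G, x, gamma, w_bound):
    """Return (c, d) with the property that c.u <= d guarantees
    h(x_next) >= (1 - gamma) * h(x) for every noise w with ||w|| <= w_bound,
    where h(x) = b - a.x and x_next = fx + G u + w (assumed model)."""
    h = b - dot(a, x)
    # c = G^T a, so that a.(G u) = c.u
    c = [sum(a[i] * G[i][j] for i in range(len(a))) for j in range(len(G[0]))]
    # Worst case of a.w over the noise ball is ||a|| * w_bound.
    margin = math.sqrt(dot(a, a)) * w_bound
    d = b - dot(a, fx) - margin - (1.0 - gamma) * h
    return c, d

def project_halfspace(u, c, d):
    """Euclidean projection of u onto the half-space {v : c.v <= d}."""
    violation = dot(c, u) - d
    if violation <= 0.0:
        return list(u)  # already safe, leave unchanged
    cc = dot(c, c)
    if cc == 0.0:
        raise ValueError("no control input satisfies the safety condition")
    return [ui - (violation / cc) * ci for ui, ci in zip(u, c)]

if __name__ == "__main__":
    a, b = [1.0, 0.0], 1.0           # safe region: first coordinate <= 1
    x = [0.5, 0.0]
    fx = list(x)                     # assumed drift f(x) = x
    G = [[1.0, 0.0], [0.0, 1.0]]     # assumed input matrix g(x) = I
    c, d = dcbf_halfspace(a, b, fx, G, x, gamma=0.5, w_bound=0.05)
    u_safe = project_halfspace([1.0, 0.3], c, d)
    print("safe input:", u_safe)
```

Because the assumed dynamics are affine in the input and $h$ is affine in the state, the safety condition is linear in $u_t$, which is why the resulting set is convex and the projection has a closed form. With a nonlinear $h$ the set would generally need a different treatment.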