Safe Non-Stochastic Control of Control-Affine Systems: An Online Convex Optimization Approach
Hongyu Zhou, Yichen Song, Vasileios Tzoumas
TL;DR
Safe-NSC addresses the safe control of nonlinear, control-affine systems under bounded non-stochastic disturbances with time-varying safety constraints. It introduces Safe-OGD, an algorithm based on online gradient descent that uses discrete-time control barrier functions to preserve safety while minimizing a convex loss, achieving bounded dynamic regret against a clairvoyant oracle. The framework extends to linear policy parameterizations and provides regret bounds that recover classical OCO results in time-invariant settings and converge to optimal linear controllers in static scenarios. Evaluation on an inverted pendulum and a quadrotor in a cluttered environment demonstrates safety guarantees and competitive tracking performance against strong baselines. This work advances real-time, provably safe autonomy under non-stochastic disturbances for nonlinear, control-affine systems, with practical implications for robust autonomous robotics.
Abstract
We study how to safely control nonlinear control-affine systems that are corrupted by bounded non-stochastic noise, i.e., noise that is unknown a priori and that is not necessarily governed by a stochastic model. We focus on safety constraints that take the form of time-varying convex constraints such as collision-avoidance and control-effort constraints. We provide an algorithm with bounded dynamic regret, i.e., bounded suboptimality against an optimal clairvoyant controller that knows the realization of the noise a priori. We are motivated by the future of autonomy where robots will autonomously perform complex tasks despite unpredictable real-world disturbances such as wind gusts. To develop the algorithm, we capture our problem as a sequential game between a controller and an adversary, where the controller plays first, choosing the control input, whereas the adversary plays second, choosing the noise's realization. The controller aims to minimize its cumulative tracking error despite being unable to know the noise's realization a priori. We validate our algorithm in simulated scenarios of (i) an inverted pendulum aiming to stay upright, and (ii) a quadrotor aiming to fly to a goal location through an unknown cluttered environment.
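To make the sequential game concrete, the following is a minimal sketch of generic projected online gradient descent on a toy scalar control-affine system. It is not the paper's Safe-OGD algorithm: the dynamics, the loss, the step size `eta`, the noise bound, and the time-varying bound `u_max` are all illustrative assumptions, and a simple interval projection stands in for the control-barrier-function safety constraints.

```python
import numpy as np


def project_box(u, lo, hi):
    """Euclidean projection onto [lo, hi], a stand-in for a convex safe set."""
    return float(np.clip(u, lo, hi))


def run_projected_ogd(T=200, eta=0.1, seed=0):
    """Play T rounds of the controller-vs-adversary game on a toy system.

    Assumed (hypothetical) dynamics: x_next = f(x) + g(x) * u + w,
    with f(x) = 0.5 * sin(x), g(x) = 1, and bounded noise |w| <= 0.1.
    """
    rng = np.random.default_rng(seed)
    x, u = 0.0, 0.0
    total_loss = 0.0
    played = []  # (input, bound) pairs, recorded to check the constraint later

    for t in range(T):
        # Assumed time-varying control-effort constraint |u| <= u_max(t).
        u_max = 1.0 + 0.5 * np.sin(0.05 * t)

        # Controller plays first: project the candidate input onto the
        # current constraint set before applying it.
        u = project_box(u, -u_max, u_max)
        played.append((u, u_max))

        # Adversary plays second: bounded noise, unknown to the controller.
        # (Sampled uniformly here only for illustration.)
        w = rng.uniform(-0.1, 0.1)

        # System evolves; the convex tracking loss is revealed afterwards.
        x_next = 0.5 * np.sin(x) + u + w
        total_loss += x_next ** 2

        # Gradient of the loss with respect to u (d x_next / d u = 1 here).
        grad = 2.0 * x_next

        # Gradient step; the projection is applied at the start of the
        # next round, once the next constraint set is known.
        u = u - eta * grad
        x = x_next

    return total_loss, played
```

In this sketch the constraint holds at every round by construction, because each input is projected before it is applied. Dynamic regret would compare `total_loss` against the loss of a clairvoyant sequence of inputs chosen with knowledge of every `w`, which the sketch does not compute.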
