Predictive Linear Online Tracking for Unknown Targets

Anastasios Tsiamis; Aren Karapetyan; Yueshan Li; Efe C. Balta; John Lygeros

TL;DR

This work tackles online tracking of unknown, time-varying targets in linear systems with quadratic costs. It introduces PLOT, which learns time-varying target dynamics via recursive least squares with forgetting and integrates these predictions into a receding horizon control framework under certainty equivalence. A dynamic regret analysis shows the algorithm achieves ${\mathcal{R}(\pi) \le O(\sqrt{T V_T})}$, with logarithmic regret in the static case $V_T=0$, where $V_T$ is the total variation of the target dynamics. The authors validate PLOT extensively in simulations and on a Crazyflie quadrotor, providing open-source software and demonstrating practical viability of online non-stochastic control on real hardware.
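
To make the bound concrete, here is a short worked calculation. The polynomial growth rate $V_T = O(T^{\beta})$ is an assumption made here for illustration, not a condition taken from the paper.

```latex
\begin{align*}
V_T = O(T^{\beta}),\ 0 \le \beta \le 1
  \quad &\Longrightarrow \quad \sqrt{T V_T} = O\big(T^{(1+\beta)/2}\big), \\
\beta = \tfrac{1}{2}
  \quad &\Longrightarrow \quad \sqrt{T V_T} = O\big(T^{3/4}\big), \\
\beta = 1
  \quad &\Longrightarrow \quad \sqrt{T V_T} = O(T).
\end{align*}
```

Under this assumption the bound is sublinear in $T$ whenever the total variation grows sublinearly ($\beta < 1$), and it becomes linear once the target dynamics vary at a constant rate.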

Abstract

In this paper, we study the problem of online tracking in linear control systems, where the objective is to follow a moving target. Unlike classical tracking control, the target is unknown and non-stationary, and its state is revealed sequentially, so the problem fits the framework of online non-stochastic control. We consider the case of quadratic costs and propose a new algorithm, called predictive linear online tracking (PLOT). The algorithm uses recursive least squares with exponential forgetting to learn a time-varying dynamic model of the target. The learned model is used in the optimal policy under the framework of receding horizon control. We show that the dynamic regret of PLOT scales as $\mathcal{O}(\sqrt{TV_T})$, where $V_T$ is the total variation of the target dynamics and $T$ is the time horizon. Unlike prior work, our theoretical results hold for non-stationary targets. We implement PLOT on a real quadrotor and provide open-source software, thereby showcasing one of the first successful applications of online control methods on real hardware.
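
The prediction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a first-order linear target model $r_{t+1} \approx \Theta r_t$, and the class name `ForgettingRLS`, the regularizer `reg`, and the rotating-target demo are hypothetical choices made for this example.

```python
import numpy as np


class ForgettingRLS:
    """Recursive least squares with exponential forgetting for y_t ~ Theta @ z_t.

    Assumption for this sketch: the target follows a first-order linear model.
    """

    def __init__(self, dim_out, dim_in, gamma=0.95, reg=1e-3):
        self.gamma = gamma                        # forgetting factor in (0, 1)
        self.Theta = np.zeros((dim_out, dim_in))  # current model estimate
        self.P = np.eye(dim_in) / reg             # inverse of the regularized Gram matrix

    def update(self, z, y):
        """Incorporate one regressor/observation pair (z, y)."""
        Pz = self.P @ z
        gain = Pz / (self.gamma + z @ Pz)
        err = y - self.Theta @ z                  # one-step prediction error
        self.Theta = self.Theta + np.outer(err, gain)
        self.P = (self.P - np.outer(gain, Pz)) / self.gamma
        return self.Theta

    def predict(self, r, horizon):
        """Roll the learned model forward to predict the next `horizon` target states."""
        preds = []
        for _ in range(horizon):
            r = self.Theta @ r
            preds.append(r)
        return np.array(preds)


# Demo: a target moving on a circle, i.e. r_{t+1} = R r_t for a fixed rotation R.
angle = 0.1
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])

rls = ForgettingRLS(dim_out=2, dim_in=2, gamma=0.95)
r = np.array([1.0, 0.0])
for _ in range(200):
    r_next = R @ r          # the target state is revealed sequentially
    rls.update(r, r_next)
    r = r_next

# Predictions over a window of W = 5 steps, as a receding horizon controller would consume them.
preds = rls.predict(r, horizon=5)
```

In a certainty-equivalent receding horizon scheme, the rows of `preds` would stand in for the unknown future targets when computing the control input at each step. A smaller `gamma` discounts old data faster, which helps when the target dynamics drift but makes the estimate noisier.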

Paper Structure (50 sections, 14 theorems, 120 equations, 23 figures, 1 table, 4 algorithms)

Introduction
Contribution
Dynamic regret for online tracking.
Prediction of time-varying partially observed systems.
Experimental demonstration.
Organization and Notation
Problem Statement
Control Objective
LQT Optimal Controller
Predictive Linear Online Tracking
Target Prediction.
Receding Horizon Control.
Dynamic Regret and Tuning
Simulations and Experimental Validation
Simulation Results
...and 35 more sections

Key Result

Theorem 4.1

Select a prediction horizon $W$ and a forgetting factor $\gamma\in(0, 1)$. Let $\rho$ be the decay rate of the LQT gains as in (eq:LQT_gains_exponenitally_decaying) and let $\tilde{W}=\min\{(1-\rho)^{-1},W\}$. The dynamic regret of the PLOT policy, as given by Algorithm (alg:mpc), is upper bounded by [...], where $\alpha_1,\alpha_2,\alpha_3,\alpha_4$ (given in (eq:alpha_coeffs)) are positive constants related ...

Figures (23)

Figure 1: Trajectory plots of a circular target with path length $V_T=0$ and of the PLOT algorithm for varying prediction horizon lengths, simulated for $T=2,3,5$ and $T=7$ seconds.
Figure 2: Log-normalized regret of the PLOT Algorithm with a range of prediction horizon lengths, simulated over a horizon of $T = 200$ seconds.
Figure 3: Trajectory plot of a spiral target with path length $V_T = \mathcal{O}(\sqrt{T})$, tracked by PLOT with $W=5$ and a range of values for $\gamma$.
Figure 4: Regret of PLOT with varying $\gamma_a = 1-c_\gamma T^{-a}$; $\gamma_{0.25}$ is regret-optimal based on Corollary 4.2.
Figure 5: Dynamic Regret of online control algorithms applied to the online tracking problem.
...and 18 more figures

Theorems & Definitions (27)

Remark 3.1: Improper Learning versus Single Learner
Theorem 4.1: Dynamic Regret
Corollary 4.2: Tuning
Remark 4.3: Path Length and Complexity
Lemma 4.4: Performance Difference Lemma [foster2020logarithmic]
Theorem 4.5: Regret for AR system prediction
Proposition 2.1: Stability [zhang2021regret]
Example 3.1: Constant Velocity Target
Example 3.2: Circular Target with Constant Speed
Lemma 3.3
...and 17 more
