Bounds for the tracking error of first-order online optimization methods

Liam Madden, Stephen Becker, Emiliano Dall'Anese

Abstract

This paper investigates online algorithms for smooth time-varying optimization problems, focusing first on methods with constant step-size, momentum, and extrapolation-length. Assuming strong convexity, precise results for the tracking iterate error (the limit supremum of the norm of the difference between the optimal solution and the iterates) for online gradient descent are derived. The paper then considers a general first-order framework, where a universal lower bound on the tracking iterate error is established. Furthermore, a method using "long-steps" is proposed and shown to achieve the lower bound up to a fixed constant. This method is then compared with online gradient descent for specific examples. Finally, the paper analyzes the effect of regularization when the cost is not strongly convex. With regularization, it is possible to achieve a non-regret bound. The paper ends by testing the accelerated and regularized methods on synthetic time-varying least-squares and logistic regression problems, respectively.

Paper Structure

This paper contains 14 sections, 7 theorems, 32 equations, 5 figures, 1 table, 1 algorithm.

Key Result

Theorem 3.1

Suppose that $(f_t)\in \mathcal{S}(\kappa^{-1},L,\sigma)$ and let $\alpha\in ]0,2/(\mu+L)]$. Then, given $x_0$, $\texttt{ALG}(\alpha,0,0)$ constructs a sequence $(x_t)$ such that where $\mu=\kappa^{-1}L$. In particular, the bound is minimized for $\alpha=\frac{2}{\mu+L}$, in which case,
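As a rough numerical illustration of the setting in Theorem 3.1, the sketch below runs online gradient descent (taken here to be what $\texttt{ALG}(\alpha,0,0)$ denotes) on a translating quadratic with $\alpha=\frac{2}{\mu+L}$. Everything in it is an assumption made for illustration, not taken from the paper: the function name `online_gradient_descent`, the diagonal quadratic cost, the rule that moves the minimizer by exactly $\sigma$ per step in a random direction, and the reference level $\sigma/(\alpha\mu)$. That level comes from the standard contraction estimate $\|x_{t+1}-x_{t+1}^\star\|\le(1-\alpha\mu)\|x_t-x_t^\star\|+\sigma$ and is not necessarily the exact bound stated in the theorem.

```python
import numpy as np


def online_gradient_descent(mu, L, sigma, steps, alpha=None, seed=0):
    """Track the minimizer of f_t(x) = 0.5 * (x - c_t)^T A (x - c_t), A = diag(mu, L).

    Hypothetical illustration: the cost, the drift rule, and all names are
    assumptions, not the authors' experimental setup.
    """
    rng = np.random.default_rng(seed)
    if alpha is None:
        alpha = 2.0 / (mu + L)  # step size singled out in the theorem
    hessian_diag = np.array([mu, L], dtype=float)
    c = np.zeros(2)           # current minimizer x_t^*
    x = np.array([1.0, 1.0])  # initial iterate x_0
    errors = []
    for _ in range(steps):
        # Gradient step on the current cost f_t.
        x = x - alpha * hessian_diag * (x - c)
        # The minimizer then moves by exactly sigma in a random direction.
        direction = rng.standard_normal(2)
        c = c + sigma * direction / np.linalg.norm(direction)
        # Iterate error measured against the new minimizer.
        errors.append(np.linalg.norm(x - c))
    return alpha, np.array(errors)


mu, L, sigma = 1.0, 10.0, 0.1
alpha, errors = online_gradient_descent(mu, L, sigma, steps=500)
reference = sigma / (alpha * mu)  # equals sigma * (kappa + 1) / 2 for this alpha
print(f"largest late-stage error: {errors[200:].max():.4f}")
print(f"contraction-based level:  {reference:.4f}")
```

With a random drift direction the observed error typically sits well below the reference level; a drift that always pushes along the slowest-contracting direction would be the natural way to probe how tight the level is.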

Figures (5)

  • Figure 1: Movement of iterates and minimizers
  • Figure 2: Algorithms applied to the online Nesterov function with $L=500$, $d=1000$, $a=(L+\mu)/2$, $\sigma =1$, and $T=\lfloor(2+\sqrt{2})\sqrt{\kappa}\rfloor$. (a) Evolution of the iterate error for the particular example $\mu=1$. (b) Tracking iterate error for varying $\mu$.
  • Figure 3: Algorithms applied to the translating quadratic function with $d=2$, $\sigma =1$, and $T=\lfloor(2+\sqrt{2})\sqrt{\kappa}\rfloor$. (a) shows the evolution of the iterate error for $L=500$ and $\mu=1$. (b) shows the tracking iterate error for $L=1$ and varying $\mu$.
  • Figure 4: Least-squares regression with random data matrix and random walk variation of the minimizer.
  • Figure 5: Logistic regression with random data matrix and randomly flipping labels.

Theorems & Definitions (11)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 4.1
  • proof
  • Remark 4.1
  • Theorem 4.2
  • proof
  • Lemma 5.1
  • Theorem 5.1
  • ...and 1 more