Automatic Differentiation of Optimization Algorithms with Time-Varying Updates
Sheheryar Mehmood, Peter Ochs
TL;DR
This paper applies unrolled or automatic differentiation to a time-varying iterative process, provides convergence (rate) guarantees for the resulting derivative iterates, and adapts these results to proximal gradient descent with variable step size and to FISTA when solving partly smooth problems.
Abstract
Numerous optimization algorithms have a time-varying update rule owing to, for instance, a changing step size, momentum parameter, or Hessian approximation. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We adapt these convergence results and apply them to proximal gradient descent with variable step size and FISTA when solving partly smooth problems. We confirm our findings numerically by solving $\ell_1$- and $\ell_2$-regularized linear and logistic regression, respectively. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.
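
As a rough illustration of what "derivative iterates" of a time-varying process look like, the following minimal sketch unrolls plain gradient descent with a changing step size on a one-dimensional $\ell_2$-regularized least-squares problem and propagates the derivative with respect to the regularization parameter alongside the iterate. It is not the paper's experimental setup: the function name `unrolled_gd`, the constants `a` and `b`, and the step-size schedule are all assumptions chosen so that a closed-form solution is available for comparison.

```python
def unrolled_gd(theta, a=2.0, b=3.0, num_iters=200):
    """Minimize f(x) = 0.5*(a*x - b)**2 + 0.5*theta*x**2 by gradient descent
    with a time-varying step size, tracking dx_k/dtheta (forward mode)."""
    x, dx = 0.0, 0.0  # iterate x_k and derivative iterate dx_k/dtheta
    for k in range(num_iters):
        # Hypothetical schedule: changes with k but does not depend on theta.
        alpha = 0.1 + 0.05 / (k + 1)
        grad = a * (a * x - b) + theta * x
        # Total derivative of grad w.r.t. theta via the chain rule:
        # (d grad / dx) * dx + (d grad / d theta)
        dgrad = (a * a + theta) * dx + x
        # Differentiating x_{k+1} = x_k - alpha_k * grad gives the derivative update.
        x, dx = x - alpha * grad, dx - alpha * dgrad
    return x, dx


def closed_form(theta, a=2.0, b=3.0):
    """Exact minimizer and its derivative w.r.t. theta, for comparison."""
    x_star = a * b / (a * a + theta)
    dx_star = -a * b / (a * a + theta) ** 2
    return x_star, dx_star


x, dx = unrolled_gd(1.0)
x_star, dx_star = closed_form(1.0)
print(abs(x - x_star), abs(dx - dx_star))
```

In this toy setting both the iterate and its derivative approach their closed-form limits, which is the qualitative behavior the abstract describes; the step sizes here stay below $2/L$ with $L = a^2 + \theta$, an assumption needed for the sketch to converge.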
