Second-Order Optimization via Quiescence

Aayushya Agarwal, Larry Pileggi, Ronald Rohrer

TL;DR

This work proposes a second-order optimization method that models the trajectory of the optimization variables as an ODE. The method adaptively selects large step-sizes that follow each optimization variable, in sequence, to a quasi-steady state until all state variables reach the actual steady state, which coincides with the optimum.

Abstract

Second-order optimization methods exhibit fast convergence to critical points. In nonconvex optimization, however, these methods often require restrictive step-sizes to ensure a monotonically decreasing objective function. In the presence of highly nonlinear objective functions with large Lipschitz constants, increasingly small step-sizes become a bottleneck to fast convergence. We propose a second-order optimization method that utilizes a dynamic system model to represent the trajectory of optimization variables as an ODE. We then follow the quasi-steady state trajectory by forcing variables with the fastest rise time into a state known as quiescence. This optimization via quiescence allows us to adaptively select large step-sizes that sequentially follow each optimization variable to a quasi-steady state until all state variables reach the actual steady state, coinciding with the optimum. The result is a second-order method that utilizes large step-sizes and does not require a monotonically decreasing objective function to reach a critical point. Experimentally, we demonstrate the fast convergence of this approach on nonconvex problems in power systems and compare it to existing state-of-the-art second-order methods, including damped Newton-Raphson, BFGS, and SR1.
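
The step-size bottleneck described in the abstract can be illustrated with a minimal sketch. This is not the proposed method, only a fixed-step forward Euler integration of a gradient-flow ODE $\dot{x} = -\nabla f(x)$. The quadratic test function, the matrix `A`, the starting point, and the helper name `forward_euler_gradient_flow` are assumptions chosen for illustration. For this quadratic, forward Euler is stable only when the step-size is below $2/\lambda_{\max}(A) = 2 \times 10^{-3}$, so the stiffest direction dictates the step for every variable.

```python
import numpy as np

def forward_euler_gradient_flow(grad, x0, dt, n_steps):
    """Integrate dx/dt = -grad(x) with a fixed-step forward Euler scheme."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - dt * grad(x)
    return x

# Hypothetical stiff quadratic f(x) = 0.5 * x^T A x with curvatures 1 and 1000.
# Its only critical point is x = 0, the steady state of the gradient flow.
A = np.diag([1.0, 1000.0])

def grad(x):
    return A @ x

x0 = [1.0, 1.0]

# Step below the stability limit 2 / 1000: converges, but the slow
# direction (curvature 1) needs many thousands of steps.
stable = forward_euler_gradient_flow(grad, x0, dt=1.0e-3, n_steps=20000)

# Step above the stability limit: the fast direction oscillates and grows.
unstable = forward_euler_gradient_flow(grad, x0, dt=2.5e-3, n_steps=200)

print("stable run, distance to optimum:  ", np.linalg.norm(stable))
print("unstable run, distance to optimum:", np.linalg.norm(unstable))
```

The first run ends close to the optimum only after many small steps, while the second run moves away from it. This is the trade-off that motivates choosing step-sizes per variable rather than from the global worst-case curvature.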

Paper Structure

This paper contains 19 sections, 5 theorems, 64 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

$-\frac{\dot{x}_{nq_i}(0)}{\ddot{x}_{nq_i}(0)}$ approximates the first-order time constant of each non-quiescent state variable, $x_{nq_i}$.
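
As a short worked check of why this ratio behaves like a time constant, assume (for illustration only; this is not taken from the paper's proof) that $x_{nq_i}$ follows a single first-order exponential response toward a steady-state value. The symbols $x_{ss}$ and $\tau$ are introduced here for that purpose:

$$
x_{nq_i}(t) = x_{ss} + \bigl(x_{nq_i}(0) - x_{ss}\bigr)\, e^{-t/\tau}
\;\;\Longrightarrow\;\;
\dot{x}_{nq_i}(0) = -\frac{x_{nq_i}(0) - x_{ss}}{\tau},
\qquad
\ddot{x}_{nq_i}(0) = \frac{x_{nq_i}(0) - x_{ss}}{\tau^{2}}
$$

$$
-\frac{\dot{x}_{nq_i}(0)}{\ddot{x}_{nq_i}(0)} = \tau
$$

Under this assumption the ratio recovers $\tau$ exactly, and a single step of length $\tau$ along the initial slope lands on the steady state: $x_{nq_i}(0) + \tau\,\dot{x}_{nq_i}(0) = x_{ss}$. For a response that is only approximately first-order, the ratio is an estimate of the time constant rather than an exact value.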

Figures (3)

  • Figure 1: Minimizing a Rosenbrock function using an explicit numerical integration method (Matlab ODE23) results in numerical oscillations and divergence from the optimum. Forward Euler (FE) with a smaller time-step (1.5e-3 s) converges to the local optimum but requires 5425 iterations (time-steps).
  • Figure 2: Transient response of gradient flow using (a) FE with a time-step of 1 s and (b) OptiQ
  • Figure 3: Comparison of second-order methods in optimizing test functions

Theorems & Definitions (11)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • Lemma 5
  • proof
  • ...and 1 more