A robust BFGS algorithm for unconstrained nonlinear optimization problems
Yaguang Yang
TL;DR
This work introduces a robust BFGS algorithm for unconstrained nonlinear optimization that globally converges to a stationary point for any twice differentiable function, convex or non-convex. The method forms a robust Hessian estimate $Q(x_k)= \gamma_k I + (1-\gamma_k) H(x_k)$ with a dynamically chosen $\gamma_k$, and uses a BFGS-like update for the robust matrix $E_k$ to preserve descent directions while keeping computational cost similar to classic BFGS. Under standard smoothness and second-order conditions, the algorithm is globally convergent and, near a local minimizer with a strongly positive definite Hessian, it reduces to BFGS and achieves superlinear convergence. Numerical tests on the CUTEst set show that the proposed method is robust, efficient, and competitive with fminunc, L-BFGS, and conjugate gradient variants.
Abstract
In this paper, a modified BFGS algorithm is proposed. The modified BFGS matrix estimates a modified Hessian matrix, which is a convex combination of the identity matrix (as used by the steepest descent algorithm) and the Hessian matrix (as used by the Newton algorithm). The coefficient of the convex combination is chosen dynamically in every iteration. It is proved that, for any twice differentiable nonlinear function (convex or non-convex), the algorithm is globally convergent to a stationary point. If the stationary point is a local optimizer and the Hessian is strongly positive definite in a neighborhood of that optimizer, the iterates eventually enter and stay in the neighborhood, where the modified BFGS algorithm reduces to the BFGS algorithm. In this case, the modified BFGS algorithm is super-linearly convergent. Moreover, the computational cost of each iteration of the modified BFGS algorithm is almost the same as that of the BFGS algorithm. Numerical tests on the CUTEst test set are reported. The performance of the modified BFGS algorithm, implemented in our MATLAB function, is compared to that of the BFGS algorithm implemented in the MATLAB Optimization Toolbox function, a limited-memory BFGS algorithm implemented as L-BFGS, a descent conjugate gradient algorithm implemented as CG-Descent 5.3, and a limited-memory descent conjugate gradient algorithm implemented as L-CG-Descent. The results show that the modified BFGS algorithm may be very effective.
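To illustrate the convex combination described above, the following is a minimal Python sketch, not the paper's implementation. It forms $Q = \gamma I + (1-\gamma) H$ for an indefinite $H$ and checks that the resulting Newton-like direction is a descent direction. The function names `robust_hessian` and `smallest_gamma`, the threshold `delta`, and the eigenvalue-based rule for picking $\gamma$ are all assumptions made for illustration; the abstract only states that the coefficient is chosen dynamically and does not give the rule.

```python
import numpy as np


def robust_hessian(H, gamma):
    """Convex combination gamma * I + (1 - gamma) * H for gamma in [0, 1]."""
    n = H.shape[0]
    return gamma * np.eye(n) + (1.0 - gamma) * H


def smallest_gamma(H, delta=0.1):
    """Hypothetical rule (an assumption, not the paper's choice of gamma_k).

    Returns the smallest gamma in [0, 1] for which the smallest eigenvalue
    of gamma * I + (1 - gamma) * H is at least delta, with 0 < delta < 1.
    The eigenvalues of the combination are gamma + (1 - gamma) * lambda_i.
    """
    assert 0.0 < delta < 1.0
    lam_min = np.linalg.eigvalsh(H)[0]  # eigenvalues come back in ascending order
    if lam_min >= delta:
        return 0.0  # H is already sufficiently positive definite: pure Newton
    return (delta - lam_min) / (1.0 - lam_min)


# An indefinite Hessian: the plain Newton direction need not be a descent direction.
H = np.array([[2.0, 0.0], [0.0, -1.0]])
g = np.array([1.0, 1.0])  # gradient at the current iterate (made-up values)

gamma = smallest_gamma(H)
Q = robust_hessian(H, gamma)
d = -np.linalg.solve(Q, g)  # search direction from the modified Hessian

print("gamma:", gamma)
print("eigenvalues of Q:", np.linalg.eigvalsh(Q))
print("g^T d:", g @ d)
```

With $\gamma = 0$ the direction is the Newton direction, and with $\gamma = 1$ it is the steepest descent direction. In this sketch, any $\gamma$ that makes $Q$ positive definite guarantees $g^{T} d < 0$ whenever $g \neq 0$.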
