Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
Qiujiang Jin, Aryan Mokhtari
TL;DR
This work establishes affine-invariant, non-asymptotic global convergence guarantees for the BFGS quasi-Newton method when minimizing a strictly convex, twice-differentiable function that is $M$-strongly self-concordant. Using an Armijo-Wolfe line search and weighted reformulations, the authors derive global linear and global superlinear rates that depend only on the initial suboptimality and the self-concordance constant $M$, not on strong convexity or on Lipschitz continuity of the gradient or Hessian. A key contribution is a proof that, after finitely many iterations, the unit step size becomes admissible and yields a $(C/t)^t$-type superlinear rate, while affine invariance under linear changes of variables is preserved. Numerical experiments on hard cubic and logistic regression objectives complement the theory and are consistent with the predicted convergence behavior and the affine-invariance property. These results extend the global non-asymptotic convergence theory for BFGS to a broad, affine-invariant class of self-concordant functions and are relevant to line-search-based quasi-Newton methods in practice.
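For reference, strong self-concordance is usually stated as follows. This is the definition common in the quasi-Newton literature (it follows Rodomanov and Nesterov), written in our own notation; the paper's exact statement and symbols may differ. A twice-differentiable convex function $f$ is $M$-strongly self-concordant if, for all $x, y, z, w$ in its domain,

$$
\nabla^2 f(y) - \nabla^2 f(x) \preceq M \, \|y - x\|_z \, \nabla^2 f(w),
\qquad
\|h\|_z := \sqrt{h^\top \nabla^2 f(z)\, h}.
$$

Because the local norm $\|\cdot\|_z$ is itself induced by the Hessian, this inequality keeps the same constant $M$ under an invertible linear change of variables $x \mapsto Ax$. This is why $M$ can appear in an affine-invariant rate.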
Abstract
In this paper, we establish global non-asymptotic convergence guarantees for the BFGS quasi-Newton method without requiring strong convexity or the Lipschitz continuity of the gradient or Hessian. Instead, we consider the setting where the objective function is strictly convex and strongly self-concordant. For an arbitrary initial point and any positive-definite initial Hessian approximation, we prove global linear and superlinear convergence guarantees for BFGS when the step size is determined by a line search scheme satisfying the weak Wolfe conditions. Moreover, all our global guarantees are affine-invariant, with the convergence rates depending solely on the initial error and the strong self-concordance constant. Our results extend the global non-asymptotic convergence theory of BFGS beyond traditional assumptions and, for the first time, establish affine-invariant convergence guarantees that align with the inherent affine invariance of the BFGS method.
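The algorithm analyzed above can be sketched in a few lines. The following is a minimal, dependency-free illustration of BFGS with a bisection line search that enforces the weak Wolfe conditions; it is not the authors' implementation. The function names (`weak_wolfe_step`, `bfgs`), the parameters `alpha = 1e-4` and `beta = 0.9`, the identity initial matrix, and the test objective are all assumptions chosen for illustration. The objective is strictly convex but neither strongly convex nor gradient-Lipschitz on the whole plane; no claim is made that it satisfies the paper's strong self-concordance assumption.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def weak_wolfe_step(f, grad, x, d, alpha=1e-4, beta=0.9, max_iter=60):
    """Bisection search for a step size satisfying the weak Wolfe conditions."""
    f0, g0d = f(x), dot(grad(x), d)
    lo, hi, eta = 0.0, math.inf, 1.0  # the unit step is tried first
    for _ in range(max_iter):
        x_new = [xi + eta * di for xi, di in zip(x, d)]
        if f(x_new) > f0 + alpha * eta * g0d:     # sufficient decrease fails
            hi = eta
        elif dot(grad(x_new), d) < beta * g0d:    # curvature condition fails
            lo = eta
        else:
            return eta
        eta = 2.0 * lo if math.isinf(hi) else 0.5 * (lo + hi)
    raise RuntimeError("line search did not find a weak Wolfe step")

def bfgs(f, grad, x0, tol=1e-6, max_iter=200):
    """BFGS maintaining an inverse-Hessian approximation H (H_0 = I, assumed)."""
    n = len(x0)
    x, g = list(x0), grad(x0)
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    for t in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol:
            return x, t
        d = [-v for v in matvec(H, g)]            # quasi-Newton direction
        eta = weak_wolfe_step(f, grad, x, d)
        x_new = [xi + eta * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        rho = 1.0 / dot(y, s)                     # y.s > 0 by the curvature condition
        Hy = matvec(H, y)
        c = rho * rho * dot(y, Hy) + rho
        # H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T, expanded for symmetric H
        H = [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + c * s[i] * s[j]
              for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x, max_iter

# Illustrative objective (an assumption, not taken from the paper):
# f(x) = exp(x1 + 2 x2) + exp(x1 - 2 x2) + exp(-x1)
def f(x):
    try:
        return math.exp(x[0] + 2 * x[1]) + math.exp(x[0] - 2 * x[1]) + math.exp(-x[0])
    except OverflowError:
        return math.inf  # treated as a failed sufficient-decrease test

def grad(x):
    e1 = math.exp(x[0] + 2 * x[1])
    e2 = math.exp(x[0] - 2 * x[1])
    e3 = math.exp(-x[0])
    return [e1 + e2 - e3, 2 * e1 - 2 * e2]

x_min, iters = bfgs(f, grad, [1.0, 1.0])
print(x_min, iters)
```

Setting the gradient to zero shows that this objective has its unique minimizer at $(-\tfrac{1}{2}\ln 2,\ 0)$, which gives a simple correctness check for the returned point. Trying $\eta = 1$ first mirrors the role of the unit step in the superlinear phase described above.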
