An hybrid stochastic Newton algorithm for logistic regression

Bernard Bercu, Luis Fredes, Eméric Gbaguidi

TL;DR

Frames the stochastic optimization problem for logistic regression and introduces a hybrid stochastic Newton algorithm that blends two Hessian-weighting components. It updates the Hessian before the parameter and uses the Sherman-Morrison-Woodbury formula to maintain efficiency, avoiding truncation. The work proves almost-sure convergence, fast convergence rates for the Hessian estimates, a central limit theorem with asymptotic efficiency, and a quadratic strong law for cumulative excess risk. Empirical results on synthetic and real data show competitive performance relative to existing second-order methods, with robustness to high dimensionality.

Abstract

In this paper, we investigate a second-order stochastic algorithm for solving large-scale binary classification problems. We propose a new hybrid stochastic Newton algorithm that includes two weighted components in the Hessian matrix estimation: the first one coming from the natural Hessian estimate and the second associated with the stochastic gradient information. Our motivation comes from the fact that both parts, evaluated at the true parameter of logistic regression, are equal to the Hessian matrix. This new formulation has several advantages and enables us to prove the almost sure convergence of our stochastic algorithm to the true parameter. Moreover, we significantly improve the almost sure rate of convergence to the Hessian matrix. Furthermore, we establish the central limit theorem for our hybrid stochastic Newton algorithm. Finally, we show a surprising result on the almost sure convergence of the cumulative excess risk.
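To make the idea of two weighted Hessian components concrete, the following is a minimal sketch and not the authors' exact recursion. It assumes labels in {0, 1}, the initialisation S_0 = I_d, and a hypothetical constant mixing weight `beta` between the natural Hessian weight p(1 - p) and the squared-residual weight (y - p)^2 that comes from the stochastic gradient. The paper's actual weighting scheme is not reproduced in this summary. The names `sherman_morrison` and `hybrid_newton_sketch` are illustrative only. The sketch updates the inverse Hessian estimate before the parameter, using a rank-one Sherman-Morrison step so that no matrix is inverted inside the loop.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def sherman_morrison(S_inv, x, a):
    """Inverse of (S + a * x x^T), given symmetric S_inv = S^{-1} and a >= 0."""
    u = S_inv @ x
    return S_inv - (a / (1.0 + a * (x @ u))) * np.outer(u, u)


def hybrid_newton_sketch(X, Y, beta=0.5):
    """Single pass of a stochastic Newton recursion with a blended Hessian weight.

    beta is a hypothetical mixing weight (an assumption of this sketch,
    not a value taken from the paper).
    """
    n, d = X.shape
    theta = np.zeros(d)
    S_inv = np.eye(d)  # inverse of the assumed initial matrix S_0 = I_d
    for x, y in zip(X, Y):
        p = sigmoid(x @ theta)
        # Natural Hessian weight p(1 - p) blended with the squared residual.
        # At the true parameter, E[(y - p)^2 | x] = p(1 - p), so both parts
        # target the same Hessian.
        a = beta * p * (1.0 - p) + (1.0 - beta) * (y - p) ** 2
        # Update the inverse Hessian estimate first ...
        S_inv = sherman_morrison(S_inv, x, a)
        # ... then take the Newton-type step with the updated inverse.
        theta = theta + (y - p) * (S_inv @ x)
    return theta, S_inv
```

Because the weight `a` is nonnegative, the denominator in `sherman_morrison` is at least 1, which is one way to see why such a recursion can run without truncating the Hessian weights.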

Paper Structure

This paper contains 11 sections, 4 theorems, 155 equations, 2 figures, 1 table.

Key Result

Theorem 3.1

Assume that Assumptions sna_cond1 and sna_cond2 hold. Then, we have the following almost sure convergences

Figures (2)

  • Figure 1: Evolution of the mean squared error with respect to the iteration.
  • Figure 2: Evolution of the expected excess risk evaluated on the test set with respect to the iteration. We observe that our hybrid stochastic Newton and the SN algorithms perform very similarly on both real-world datasets.

Theorems & Definitions (12)

  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof
  • Theorem 3.3
  • proof
  • Theorem 3.4
  • proof
  • proof
  • proof
  • ...and 2 more