AS-BOX: Additional Sampling Method for Weighted Sum Problems with Box Constraints

Nataša Krejić, Nataša Krklec Jerinkić, Tijana Ostojić, Nemanja Vučićević

TL;DR

AS-BOX addresses large-scale box-constrained finite-sum optimization by integrating projected gradient steps with adaptive variable sample sizes and a nonmonotone line search. The key novelty is additional sampling: a modest, per-iteration sample $D_k$ monitors the progress and guides when to increase the main sample size $N_k$, ensuring convergence with controlled computational cost. Theoretical results establish almost-sure convergence to stationary points and complexity bounds, with stronger guarantees in the strongly convex case, while numerical experiments on logistic regression and neural networks demonstrate superior practical efficiency and robustness compared to SIPM and PSGM. The method adaptively balances gradient accuracy and computational effort, often avoiding full-sample evaluations yet achieving fast convergence and high-quality stationarity, making it suitable for large-scale, constrained learning problems.
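The interplay between the main sample $N_k$ and the additional sample $D_k$ can be illustrated with a toy loop. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: the function name `as_box_sketch`, the least-squares objective with uniform weights, the tolerance sequence `eps_k`, the doubling rule for $N_k$, and the acceptance test on $D_k$ are all hypothetical choices made for illustration.

```python
import numpy as np

def as_box_sketch(A, b, lower, upper, n0=8, d_size=4, iters=200, c=1e-4, seed=0):
    """Toy projected-gradient loop with an additional-sampling check (illustrative only)."""
    rng = np.random.default_rng(seed)
    N, n = A.shape
    x = np.clip(np.zeros(n), lower, upper)   # feasible starting point
    n_k = min(n0, N)                         # current main sample size N_k
    history = []

    def f(idx, z):
        # Subsampled least-squares loss (assumed objective, uniform weights).
        r = A[idx] @ z - b[idx]
        return 0.5 * np.mean(r ** 2)

    def grad(idx, z):
        return A[idx].T @ (A[idx] @ z - b[idx]) / len(idx)

    for k in range(iters):
        eps_k = 1.0 / (k + 1) ** 2           # summable nonmonotone tolerance (assumption)
        idx = rng.choice(N, size=n_k, replace=False)
        g = grad(idx, x)
        d = np.clip(x - g, lower, upper) - x # projected gradient direction on the box
        t = 1.0
        # Nonmonotone Armijo-type backtracking on the subsampled function.
        while t > 1e-10 and f(idx, x + t * d) > f(idx, x) + c * t * (g @ d) + eps_k:
            t *= 0.5
        trial = x + t * d                    # stays in the box because 0 < t <= 1
        if n_k < N:
            # Additional sample D_k: a small independent batch that monitors progress.
            extra = rng.choice(N, size=d_size, replace=False)
            if f(extra, trial) <= f(extra, x) + eps_k:
                x = trial                    # progress confirmed: keep N_k
            else:
                n_k = min(N, 2 * n_k)        # progress not confirmed: enlarge N_k
        else:
            x = trial
        history.append(n_k)
    return x, history

rng = np.random.default_rng(1)
A = rng.standard_normal((256, 5))
b = A @ np.full(5, 2.0) + 0.1 * rng.standard_normal(256)
x, sizes = as_box_sketch(A, b, -np.ones(5), np.ones(5))
```

In this sketch the cheap check on $D_k$ is the only trigger for enlarging $N_k$, so the full sample is used only if the small independent batch repeatedly fails to confirm progress.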

Abstract

A class of optimization problems characterized by a weighted finite-sum objective function subject to box constraints is considered. We propose a novel stochastic optimization method, named AS-BOX (Additional Sampling for BOX constraints), that combines projected gradient directions with adaptive variable sample size strategies and nonmonotone line search. The method dynamically adjusts the batch size based on progress with respect to the additional sampling function and on structural consistency of the projected direction, enabling practical adaptivity of AS-BOX, while ensuring theoretical support. We establish almost sure convergence under standard assumptions and provide complexity bounds. Numerical experiments demonstrate the efficiency and competitiveness of the proposed method compared to state-of-the-art algorithms.
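For readers unfamiliar with projected gradient directions on a box, the sketch below may help. It assumes the common definition $d(x) = P_S(x - \nabla f(x)) - x$, where $P_S$ is the Euclidean projection onto the box $S$, and uses a small least-squares function as a stand-in objective; the helper names and the test problem are hypothetical. The inequality checked, $\nabla f(x)^T d(x) \le -\|d(x)\|^2$, is the textbook consequence of the projection property for convex sets and is not quoted from the paper.

```python
import numpy as np

def project_box(x, lower, upper):
    """Euclidean projection onto the box {x : lower <= x <= upper}."""
    return np.clip(x, lower, upper)

def projected_gradient_direction(x, grad, lower, upper):
    """d(x) = P_S(x - grad) - x for a feasible point x."""
    return project_box(x - grad, lower, upper) - x

rng = np.random.default_rng(0)
n = 5
lower, upper = -np.ones(n), np.ones(n)
A = rng.standard_normal((20, n))
b = rng.standard_normal(20)

# Feasible point and gradient of the assumed objective 0.5 * mean((Ax - b)^2).
x = project_box(rng.standard_normal(n), lower, upper)
grad = A.T @ (A @ x - b) / len(b)
d = projected_gradient_direction(x, grad, lower, upper)

# Descent check: grad^T d + ||d||^2 should be nonpositive up to rounding.
descent_gap = grad @ d + d @ d
# A full step along d lands on the projection, so it stays inside the box.
feasible = bool(np.all(x + d >= lower) and np.all(x + d <= upper))
```

Because the projection onto a box is a coordinate-wise clip, the direction costs no more than the gradient itself, which is part of why box constraints pair well with subsampled gradients.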

Paper Structure

This paper contains 12 sections, 11 theorems, 79 equations, 5 figures.

Key Result

Theorem 2.1

[Birgin1] Assume that $f \in C^1(S_k)$ and $x \in S$. Then the projected gradient direction (pgd) satisfies:

Figures (5)

  • Figure 1: Distance to the solution $||x_k - x^*||$ versus $FEV_k$ for logistic regression on the Mushrooms dataset.
  • Figure 2: Distance to the solution $||x_k - x^*||$ versus $FEV_k$ for logistic regression on the IJCNN1 dataset.
  • Figure 3: AS-BOX: evolution of the subsample size $N_k$ as a function of $FEV_k$. Part a): $N_k$ versus $FEV_k$ for the Mushrooms dataset. Part b): $N_k$ versus $FEV_k$ for the IJCNN1 dataset.
  • Figure 4: Part a): Cross-entropy loss versus $FEV_k$ for the Mushrooms dataset. Part b): Stationarity measure $\|d(x_k)\|$ versus $FEV_k$ for the Mushrooms dataset.
  • Figure 5: Part a): Cross-entropy loss versus $FEV_k$ for the IJCNN1 dataset. Part b): Stationarity measure $\|d(x_k)\|$ versus $FEV_k$ for the IJCNN1 dataset.

Theorems & Definitions (16)

  • Theorem 2.1
  • Lemma 3.1
  • Lemma 3.2
  • Theorem 3.3
  • proof
  • Theorem 3.4
  • proof
  • Theorem 3.5
  • proof
  • Theorem 3.6
  • ...and 6 more