Instance-optimal stochastic convex optimization: Can we improve upon sample-average and robust stochastic approximation?

Liwei Jiang, Ashwin Pananjady

Abstract

We study the unconstrained minimization of a smooth and strongly convex population loss function under a stochastic oracle that introduces both additive and multiplicative noise; this is a canonical and widely-studied setting that arises across operations research, signal processing, and machine learning. We begin by showing that standard approaches such as sample average approximation and robust (or averaged) stochastic approximation can lead to suboptimal -- and in some cases arbitrarily poor -- performance with realistic finite sample sizes. In contrast, we demonstrate that a carefully designed variance reduction strategy, which we term VISOR for short, can significantly outperform these approaches while using the same sample size. Our upper bounds are complemented by finite-sample, information-theoretic local minimax lower bounds, which highlight fundamental, instance-dependent factors that govern the performance of any estimator. Taken together, these results demonstrate that an accelerated variant of VISOR is instance-optimal, achieving the best possible sample complexity up to logarithmic factors while also attaining optimal oracle complexity. We apply our theory to generalized linear models and improve upon classical results. In particular, we obtain the best-known non-asymptotic, instance-dependent generalization error bounds for stochastic methods, even in linear regression.
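For readers who want a concrete point of reference for the baselines mentioned above, the sketch below shows one standard instantiation of robust (averaged) stochastic approximation: SGD with a constant stepsize followed by Polyak-Ruppert iterate averaging, run against a synthetic quadratic oracle with both additive and multiplicative gradient noise. The quadratic loss, noise model, stepsize, and parameter values here are illustrative assumptions and are not taken from the paper; in particular, this is not the VISOR procedure.

```python
import numpy as np

def noisy_grad(x, x_star, A, sigma_add, sigma_mult, rng):
    """Stochastic gradient oracle for the quadratic loss F(x) = (1/2)(x - x*)' A (x - x*):
    the true gradient A(x - x*) is perturbed by multiplicative and additive noise."""
    g = A @ (x - x_star)
    mult = 1.0 + sigma_mult * rng.standard_normal()   # multiplicative perturbation
    add = sigma_add * rng.standard_normal(x.shape)    # additive perturbation
    return mult * g + add

def averaged_sgd(x0, x_star, A, n, step, sigma_add=1.0, sigma_mult=0.5, seed=0):
    """Robust (averaged) stochastic approximation: constant-stepsize SGD iterates,
    reporting the running (Polyak-Ruppert) average of the iterates."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    avg = np.zeros_like(x0)
    for t in range(1, n + 1):
        x = x - step * noisy_grad(x, x_star, A, sigma_add, sigma_mult, rng)
        avg += (x - avg) / t                          # online average of the iterates
    return avg

if __name__ == "__main__":
    A = np.diag([1.0, 10.0])        # Hessian of the (assumed) quadratic population loss
    x_star = np.array([1.0, 1.0])   # minimizer
    x0 = np.zeros(2)                # initialize at the origin
    for n in (200, 2000, 20000):
        x_hat = averaged_sgd(x0, x_star, A, n, step=0.1)
        print(n, n * np.linalg.norm(x_hat - x_star) ** 2)   # scaled squared error
```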

Paper Structure

This paper contains 43 sections, 34 theorems, 223 equations, 4 figures, and 3 algorithms.

Key Result

Proposition 3.1

Let $F$ be a quadratic function with Hessian matrix $A$ that satisfies the non-quadraticity assumption with parameters $L \ge \mu > 0$ and $L_H = 0$. For any positive semi-definite covariance matrix $\Sigma$ and integer $n \ge 1$, the first bound of the proposition holds. In addition, there is a stochastic first-order method such that, for any $(f,P) \in \mathcal{N}(n,F,\Sigma)$, the second bound holds whenever the variance-reduction assumption is satisfied with the parameter specified in the full statement.
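For context, in the simplest version of this quadratic setting, where the stochastic oracle returns $\nabla F(x) = A(x - x^\star)$ corrupted only by additive noise with covariance $\Sigma$, the classical instance-dependent benchmark from Polyak-Juditsky-style averaging analyses, which is not necessarily the exact bound appearing in the proposition, takes the sandwich form

$$\mathbb{E}\,\bigl\|\widehat{x}_n - x^\star\bigr\|_2^2 \;\approx\; \frac{1}{n}\,\operatorname{tr}\bigl(A^{-1}\Sigma A^{-1}\bigr),$$

illustrating how the interplay between the Hessian $A$ and the noise covariance $\Sigma$ governs the attainable error at a given instance.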

Figures (4)

  • Figure 1: Heat maps of $\sqrt{n}(\widehat{x}_n^{(\texttt{RPJ})} - x^\star)$ for different $n$. We always initialize the algorithm at the origin (initial distance to minimizer is $\sqrt{2}$) and each heatmap is generated over $10{,}000$ trials.
  • Figure 2: Heat maps of $\sqrt{n}(\widehat{x}_n^{(\texttt{RPJ})} - x^\star)$ for different $\zeta^2$ and sample size $n = 200\zeta^2$. We always initialize the algorithm at the origin (initial distance to minimizer is $\sqrt{2}$) and each heatmap is generated over $10{,}000$ trials. (Note that $x_1$ and $x_2$ have different scales in the above plots.)
  • Figure 4: Comparison of averaging (constant stepsize) and our algorithm. All algorithms are initialized at the origin, and the total number of samples is $n = 200\zeta^2$. The error (y-axis) $n\|\widehat{x}_n - x^\star\|_2^2$ is averaged over 100 runs, where $\widehat{x}_n$ denotes the output of each algorithm under a given parameter setting.
  • Figure 5: Comparison of averaging (diminishing stepsize) and our algorithm. All algorithms are initialized at the origin, and the total number of samples is $n = 200\zeta^2$. The error (y-axis) $n\|\widehat{x}_n - x^\star\|_2^2$ is averaged over 100 runs, where $\widehat{x}_n$ denotes the output of each algorithm under a given parameter setting. (A sketch of this shared evaluation protocol follows the list.)
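The captions above share one evaluation protocol: initialize at the origin, draw $n = 200\zeta^2$ samples, compute the scaled error $n\|\widehat{x}_n - x^\star\|_2^2$, and average it over independent runs. A minimal sketch of that harness, under assumed dimensions and parameter values, is below; the estimator plugged in is a crude stand-in rather than the paper's VISOR or averaging procedures.

```python
import numpy as np

def scaled_error(estimator, n, x_star, runs=100, seed=0):
    """Evaluation protocol from the figure captions: average n * ||x_hat_n - x*||_2^2
    over independent runs, with every run initialized at the origin."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(runs):
        x0 = np.zeros_like(x_star)        # initialize at the origin
        x_hat = estimator(x0, n, rng)     # any estimator that consumes n samples
        errs.append(n * np.linalg.norm(x_hat - x_star) ** 2)
    return float(np.mean(errs))

if __name__ == "__main__":
    x_star = np.array([1.0, 1.0])

    def crude_estimator(x0, n, rng):
        # Placeholder: average n noisy observations of x*; purely illustrative,
        # ignores x0, and is not any algorithm from the paper.
        return x_star + rng.standard_normal((n, x_star.size)).mean(axis=0)

    zeta2 = 10.0
    n = int(200 * zeta2)                  # sample-size scaling used in the plots
    print(scaled_error(crude_estimator, n, x_star))
```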

Theorems & Definitions (58)

  • Proposition 3.1
  • Lemma 4.1
  • proof
  • Definition 4.2: Sub-exponential random vectors
  • Remark 1
  • Lemma 4.3
  • Proposition 4.4
  • Proposition 4.5
  • Proposition 5.1
  • proof
  • ...and 48 more