
Stochastic Bilevel Optimization with Heavy-Tailed Noise

Abstract

This paper considers smooth bilevel optimization in which the lower-level problem is strongly convex and the upper-level problem is possibly nonconvex. We focus on the stochastic setting, where the algorithm can access unbiased stochastic gradient evaluations with heavy-tailed noise, which is prevalent in many machine learning applications such as training large language models and reinforcement learning. We propose a nested-loop normalized stochastic bilevel approximation (NSBA) method for finding an $\epsilon$-stationary point with a stochastic first-order oracle (SFO) complexity of $\tilde{\mathcal{O}}\big(\mathrm{poly}(\kappa)\,\sigma^{p/(p-1)}\,\epsilon^{-(3p-2)/(p-1)}\big)$, where $\kappa$ is the condition number, $p \in (1,2]$ is the order of the bounded central moment of the noise, and $\sigma$ is the noise level. Furthermore, we specialize our idea to the nonconvex-strongly-concave minimax optimization problem, achieving an $\epsilon$-stationary point with an SFO complexity of the same order. All of the above upper bounds match the best-known results in the special case of the bounded-variance setting, i.e., $p=2$. We also conduct numerical experiments to demonstrate the empirical superiority of the proposed methods.
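To make the high-level description concrete, below is a minimal Python sketch of a nested-loop normalized stochastic bilevel update in the spirit of NSBA: an inner SGD loop tracks the strongly convex lower-level solution, and the upper-level variable takes a normalized momentum step along a stochastic hypergradient estimate. The toy quadratic objectives, the Pareto-tail noise model, and all step sizes are illustrative assumptions, not the paper's actual algorithm or constants.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

def heavy_tailed_noise(alpha=1.8, scale=0.1):
    # Symmetric Pareto-type noise: infinite variance for alpha <= 2, yet
    # bounded p-th central moment for any p < alpha (the heavy-tailed regime).
    return scale * rng.choice([-1.0, 1.0], size=d) * rng.pareto(alpha, size=d)

# Toy bilevel instance (an assumption for illustration only):
#   lower level: g(x, y) = 0.5 * ||y - x||^2, strongly convex in y, y*(x) = x
#   upper level: f(x, y) = 0.5 * ||x||^2 + x @ y, so F(x) = f(x, y*(x)) has
#   hypergradient 3x and its unique stationary point is x = 0.

def stoch_grad_g_y(x, y):
    # Noisy gradient of g with respect to y.
    return (y - x) + heavy_tailed_noise()

def stoch_hypergrad(x, y):
    # For this toy g, the implicit-function correction
    # -grad2_xy(g) @ inv(grad2_yy(g)) @ grad_y(f) reduces to grad_y(f) = x,
    # so the hypergradient estimate is (x + y) + x = 2x + y, plus noise.
    return 2.0 * x + y + heavy_tailed_noise()

def nsba_sketch(T=2000, K=10, eta_y=0.5, eta_x=0.02, beta=0.9):
    x = rng.normal(size=d)
    y = rng.normal(size=d)
    m = np.zeros(d)                       # momentum for the upper-level update
    for _ in range(T):
        for _ in range(K):                # inner loop: track y*(x) by SGD
            y = y - eta_y * stoch_grad_g_y(x, y)
        m = beta * m + (1.0 - beta) * stoch_hypergrad(x, y)
        x = x - eta_x * m / (np.linalg.norm(m) + 1e-12)   # normalized step
    return x

x_out = nsba_sketch()
print("||x|| after training:", np.linalg.norm(x_out))  # should be near 0
```

The normalization in the upper-level step is the key design choice for heavy tails: it caps the influence of any single sample on the iterate at the step size $\eta_x$, so occasional extreme gradient noise cannot blow up the trajectory even when the noise variance is infinite.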