Complexity of Zeroth- and First-order Stochastic Trust-Region Algorithms
Yunsoo Ha, Sara Shashaani, Raghu Pasupathy
TL;DR
The paper investigates how Common Random Numbers (CRN) influence the sample and iteration complexity of zeroth- and first-order stochastic trust-region algorithms (ASTRO(-DF)). By analyzing MU and CE steps under CRN across zeroth- and first-order oracles and varying sample-path regularity, the authors derive complexity landscapes: without CRN, the rate is $\tilde{O}(ε^{-6})$ across cases, while CRN can yield dramatic improvements, up to $\tilde{O}(ε^{-2})$ a.s. in the first-order, smooth-path setting, and favorable reductions to $\tilde{O}(ε^{-5})$ or $\tilde{O}(ε^{-4})$ in other structured contexts. The improvements are largely attributed to general variance-reduction mechanisms, such as finite-difference error control and sample-path smoothness, rather than algorithmic specifics. The work provides a rigorous balance-condition-based analysis, strong consistency proofs, and detailed, case-dependent complexity results, with broader implications for the design of CRN-enabled stochastic TR methods in various domains.
Abstract
Model update (MU) and candidate evaluation (CE) are classical steps incorporated inside many stochastic trust-region (TR) algorithms. The sampling effort exerted within these steps, often decided with the aim of controlling model error, largely determines a stochastic TR algorithm's sample complexity. Given that MU and CE are amenable to variance reduction, we investigate the effect of incorporating common random numbers (CRN) within MU and CE on complexity. Using ASTRO and ASTRO-DF as prototype first-order and zeroth-order families of algorithms, we demonstrate that CRN's effectiveness leads to a range of complexities depending on sample-path regularity and the oracle order. For instance, we find that in first-order oracle settings with smooth sample paths, CRN's effect is pronounced -- ASTRO with CRN achieves $\tilde{O}(ε^{-2})$ a.s. sample complexity compared to $\tilde{O}(ε^{-6})$ a.s. in the generic no-CRN setting. By contrast, CRN's effect is muted when the sample paths are not Lipschitz, with the sample complexity improving from $\tilde{O}(ε^{-6})$ a.s. to $\tilde{O}(ε^{-5})$ and $\tilde{O}(ε^{-4})$ a.s. in the zeroth- and first-order settings, respectively. Since our results imply that improvements in complexity are largely inherited from generic aspects of variance reduction, e.g., finite-differencing for zeroth-order settings and sample-path smoothness for first-order settings within MU, we anticipate similar trends in other contexts.
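To illustrate the variance-reduction mechanism the abstract refers to, the following is a minimal sketch of a forward finite-difference estimate computed with and without CRN. It is not ASTRO or ASTRO-DF: the sample-path function `F`, the helper `fd_estimates`, the step size `h`, and the Gaussian noise model are all hypothetical choices made only for this illustration. With CRN, both function evaluations share the same random draw, so for this smooth sample path the noise largely cancels in the difference; with independent draws it does not, and the variance grows like $1/h^2$.

```python
import random
import statistics


def F(x, xi):
    """Hypothetical smooth sample path: F(x, xi) = (x - xi)^2, with xi ~ N(0, 1)."""
    return (x - xi) ** 2


def fd_estimates(x, h, n, crn, seed=0):
    """Return n forward finite-difference estimates of the derivative at x.

    If crn is True, F(x + h, .) and F(x, .) are evaluated at the same random
    draw (common random numbers); otherwise independent draws are used.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n):
        xi_a = rng.gauss(0.0, 1.0)
        xi_b = xi_a if crn else rng.gauss(0.0, 1.0)
        estimates.append((F(x + h, xi_a) - F(x, xi_b)) / h)
    return estimates


x, h, n = 1.0, 0.1, 2000
var_crn = statistics.variance(fd_estimates(x, h, n, crn=True))
var_ind = statistics.variance(fd_estimates(x, h, n, crn=False))
print(f"variance with CRN:    {var_crn:.3f}")
print(f"variance without CRN: {var_ind:.3f}")
```

In this toy setting the CRN estimate reduces algebraically to $2(x - ξ) + h$, whose variance does not depend on $h$, whereas the independent-draw estimate divides two non-cancelling noise terms by $h$. This is only meant to convey why sample-path smoothness matters for CRN; the complexity results themselves rest on the paper's analysis.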
