Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization

Wei Jiang, Sifan Yang, Wenhao Yang, Yibo Wang, Yuanyu Wan, Lijun Zhang

TL;DR

This work advances projection-free optimization for stochastic constrained multi-level compositional problems by introducing PMVR and PMVR-v2, which couple STORM-based variance reduction with Frank-Wolfe updates. The authors provide theoretical guarantees under three criteria (Frank-Wolfe gap, gradient mapping, and optimal gap) and extend the analysis to convex and strongly convex objectives via a stage-wise warm-start framework. The results show improved sample complexities, including $O(\varepsilon^{-1.5})$ for gradient-mapping and stage-wise convex/strongly convex rates, with large-batch strategies trading off stochastic first-order oracle (SFO) and linear minimization oracle (LMO) costs. Empirical evaluations on matrix optimization and risk-averse portfolio tasks show PMVR methods converging faster than existing projection-free baselines.
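For readers unfamiliar with STORM, the sketch below shows the generic single-level STORM recursion, $\mathbf{d}_t = \nabla f(\mathbf{x}_t; \xi_t) + (1-\alpha)\,(\mathbf{d}_{t-1} - \nabla f(\mathbf{x}_{t-1}; \xi_t))$, in which both gradients are evaluated on the same sample $\xi_t$. It is only an illustration of the variance-reduction idea. It is not the multi-level PMVR estimator from the paper, and the function name `storm_update` is hypothetical.

```python
from typing import List

Vector = List[float]


def storm_update(d_prev: Vector, grad_new: Vector, grad_old: Vector,
                 alpha: float) -> Vector:
    """One generic STORM step (illustrative, single-level only).

    d_prev   : previous estimate d_{t-1}
    grad_new : stochastic gradient at x_t     using sample xi_t
    grad_old : stochastic gradient at x_{t-1} using the SAME sample xi_t
    alpha    : momentum parameter in [0, 1]
    """
    return [gn + (1.0 - alpha) * (dp - go)
            for dp, gn, go in zip(d_prev, grad_new, grad_old)]


# alpha = 1 recovers plain SGD (the estimate is just the fresh gradient);
# alpha = 0 recovers a SARAH/SPIDER-style recursive correction.
d_sgd = storm_update([1.0, 2.0], [0.5, 0.5], [0.25, 0.25], alpha=1.0)
d_rec = storm_update([1.0, 2.0], [0.5, 0.5], [0.25, 0.25], alpha=0.0)
```

Small values of `alpha` lean on the recursive correction term, while larger values lean on the fresh stochastic gradient.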

Abstract

This paper investigates projection-free algorithms for stochastic constrained multi-level optimization. In this context, the objective function is a nested composition of several smooth functions, and the decision set is closed and convex. Existing projection-free algorithms for solving this problem suffer from two limitations: 1) they solely focus on the gradient mapping criterion and fail to match the optimal sample complexities in unconstrained settings; 2) their analysis is exclusively applicable to non-convex functions, without considering convex and strongly convex objectives. To address these issues, we introduce novel projection-free variance reduction algorithms and analyze their complexities under different criteria. For gradient mapping, our complexities improve existing results and match the optimal rates for unconstrained problems. For the widely-used Frank-Wolfe gap criterion, we provide theoretical guarantees that align with those for single-level problems. Additionally, by using a stage-wise adaptation, we further obtain complexities for convex and strongly convex functions. Finally, numerical experiments on different tasks demonstrate the effectiveness of our methods.

Paper Structure (22 sections, 13 theorems, 82 equations, 4 figures, 1 table, 4 algorithms)

Key Result

Theorem 1

By setting $B_1 = \mathcal{O}(1)$, $\eta = \mathcal{O}(\epsilon^2)$, and $\alpha = \mathcal{O}(\epsilon^2)$, our PMVR algorithm guarantees that $\mathbb{E}\left[\mathcal{F}(\mathbf{x}_\tau)\right] \leq \epsilon$ within $T = \mathcal{O}(\epsilon^{-3})$ iterations.
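As a rough illustration of the quantity this kind of guarantee controls, the sketch below assumes that $\mathcal{F}$ denotes the standard Frank-Wolfe gap, $\max_{\mathbf{v} \in \mathcal{X}} \langle \nabla F(\mathbf{x}), \mathbf{x} - \mathbf{v} \rangle$, and takes the probability simplex as an example decision set $\mathcal{X}$. Both are assumptions made for illustration. The helper names are hypothetical, and the gradient is passed in directly rather than estimated as in PMVR.

```python
from typing import List

Vector = List[float]


def lmo_simplex(grad: Vector) -> Vector:
    """Linear minimization oracle over the probability simplex.

    argmin_{v in simplex} <grad, v> is attained at the vertex whose
    coordinate has the smallest gradient entry.
    """
    i = min(range(len(grad)), key=lambda k: grad[k])
    v = [0.0] * len(grad)
    v[i] = 1.0
    return v


def frank_wolfe_gap(x: Vector, grad: Vector) -> float:
    """Frank-Wolfe gap max_{v in simplex} <grad, x - v> (always >= 0)."""
    v = lmo_simplex(grad)
    return sum(g * (xi - vi) for g, xi, vi in zip(grad, x, v))


def frank_wolfe_step(x: Vector, grad: Vector, eta: float) -> Vector:
    """Projection-free update x + eta * (v - x); stays feasible for eta in [0, 1]."""
    v = lmo_simplex(grad)
    return [(1.0 - eta) * xi + eta * vi for xi, vi in zip(x, v)]


x0 = [0.5, 0.5]
g0 = [1.0, 3.0]                        # illustrative gradient, not from the paper
gap0 = frank_wolfe_gap(x0, g0)         # 1*(0.5-1) + 3*(0.5-0) = 1.0
x1 = frank_wolfe_step(x0, g0, eta=0.5) # moves halfway toward vertex [1, 0]
```

The update only needs one call to the linear minimization oracle per iteration, which is what makes the method projection-free.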

Figures (4)

  • Figure 1: Results for matrix optimization with low-rank constraints.
  • Figure 2: Results for risk-averse portfolio optimization.
  • Figure 3: Results for risk-averse portfolio optimization.
  • Figure 4: Results for risk-averse portfolio optimization.

Theorems & Definitions (22)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 1
  • Definition 2
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • ...and 12 more