Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization

Wei Jiang; Sifan Yang; Wenhao Yang; Yibo Wang; Yuanyu Wan; Lijun Zhang


TL;DR

This work advances projection-free optimization for stochastic constrained multi-level compositional problems by introducing PMVR and PMVR-v2, which couple STORM-based variance reduction with Frank-Wolfe updates. The authors provide theoretical guarantees under three criteria (Frank-Wolfe gap, gradient mapping, and optimal gap) and extend the analysis to convex and strongly convex objectives via a stage-wise warm-start framework. The results show improved sample complexities, including $O(\varepsilon^{-1.5})$ for gradient mapping, along with stage-wise rates for convex and strongly convex objectives; large-batch strategies trade stochastic first-order oracle (SFO) calls against linear minimization oracle (LMO) calls. Empirical evaluations on matrix optimization and risk-averse portfolio tasks show that the PMVR methods converge faster than existing projection-free baselines, which indicates practical value in constrained multi-level learning settings.

Abstract

This paper investigates projection-free algorithms for stochastic constrained multi-level optimization. In this context, the objective function is a nested composition of several smooth functions, and the decision set is closed and convex. Existing projection-free algorithms for this problem suffer from two limitations: 1) they focus solely on the gradient mapping criterion and fail to match the optimal sample complexities of unconstrained settings; 2) their analysis applies only to non-convex functions and does not cover convex or strongly convex objectives. To address these issues, we introduce novel projection-free variance reduction algorithms and analyze their complexities under different criteria. For gradient mapping, our complexities improve on existing results and match the optimal rates for unconstrained problems. For the widely used Frank-Wolfe gap criterion, we provide theoretical guarantees that align with those for single-level problems. Additionally, by using a stage-wise adaptation, we obtain complexities for convex and strongly convex functions. Finally, numerical experiments on different tasks demonstrate the effectiveness of our methods.
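
For orientation, the problem class and the three criteria named above can be written out in their standard textbook form. This is a sketch under assumed notation, not a quotation from the paper: the number of levels $K$, the feasible set $\mathcal{X}$, the random variables $\xi_i$, the step parameter $\gamma$, and the projection $\Pi_{\mathcal{X}}$ are symbols introduced here for illustration, and the paper's exact definitions may differ.

```latex
% Assumed notation: K levels, closed convex set \mathcal{X}, samples \xi_i.
\begin{align*}
  \min_{\mathbf{x} \in \mathcal{X}} \; F(\mathbf{x})
    &= f_K \circ f_{K-1} \circ \cdots \circ f_1(\mathbf{x}),
    \qquad f_i(\cdot) = \mathbb{E}_{\xi_i}\!\left[ f_i(\cdot\,; \xi_i) \right], \\
  \text{Frank-Wolfe gap:} \quad
    & \max_{\mathbf{u} \in \mathcal{X}}
      \langle \nabla F(\mathbf{x}),\, \mathbf{x} - \mathbf{u} \rangle, \\
  \text{gradient mapping:} \quad
    & \frac{1}{\gamma}\left( \mathbf{x}
      - \Pi_{\mathcal{X}}\!\left( \mathbf{x} - \gamma \nabla F(\mathbf{x}) \right) \right), \\
  \text{optimal gap:} \quad
    & F(\mathbf{x}) - \min_{\mathbf{u} \in \mathcal{X}} F(\mathbf{u}).
\end{align*}
```

Only the gradient mapping involves a projection, and only as a measure of stationarity; a projection-free method never has to compute it during the run.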


Paper Structure (22 sections, 13 theorems, 82 equations, 4 figures, 1 table, 4 algorithms)


Introduction
Related Work
Stochastic Multi-Level Compositional Optimization
Stochastic Projection-Free Algorithms
The Proposed Methods
Assumptions
Results for Frank-Wolfe Gap
Results for Gradient Mapping
Results for Optimal Gap
Convex functions
Strongly convex functions
Experiments
Matrix Optimization with Low-Rank Constraints
Mean-variance Risk-averse Optimization
Mean-deviation Risk-averse Optimization
...and 7 more sections

Key Result

Theorem 1

By setting $B_1 = \mathcal{O}(1)$, $\eta = \mathcal{O}(\epsilon^2)$, and $\alpha = \mathcal{O}(\epsilon^2)$, our PMVR algorithm guarantees that $\mathbb{E}\left[\mathcal{F}(\mathbf{x}_\tau)\right] \leq \epsilon$ within $T = \mathcal{O}(\epsilon^{-3})$ iterations.
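
To make the ingredients of this statement concrete, here is a minimal Python sketch of a STORM-style variance-reduced Frank-Wolfe loop. It is not the PMVR algorithm: it handles a single-level least-squares objective over an $\ell_1$ ball, both chosen here purely for illustration, and every function name is hypothetical. The parameters `eta` and `alpha` play roles analogous to $\eta$ and $\alpha$ above, and `fw_gap` computes the standard Frank-Wolfe gap, assuming that is what $\mathcal{F}$ denotes.

```python
import random

def lmo_l1_ball(grad, radius):
    """Linear minimization oracle: argmin over ||v||_1 <= radius of <grad, v>."""
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    v = [0.0] * len(grad)
    if grad[i] != 0.0:
        v[i] = -radius if grad[i] > 0 else radius
    return v

def fw_gap(grad, x, radius):
    """Frank-Wolfe gap max_u <grad, x - u>; nonnegative whenever x is feasible."""
    v = lmo_l1_ball(grad, radius)
    return sum(g * (xi - vi) for g, xi, vi in zip(grad, x, v))

def stoch_grad(x, sample):
    """Gradient of 0.5 * (<a, x> - b)^2 for one sample (a, b)."""
    a, b = sample
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    return [residual * ai for ai in a]

def full_grad(x, data):
    """Average of the per-sample gradients (used only for evaluation)."""
    g = [0.0] * len(x)
    for sample in data:
        gs = stoch_grad(x, sample)
        g = [gi + gsi / len(data) for gi, gsi in zip(g, gs)]
    return g

def storm_frank_wolfe(data, dim, radius, steps, eta, alpha, seed=0):
    """Illustrative single-level loop: STORM gradient estimate + Frank-Wolfe step."""
    rng = random.Random(seed)
    x = [0.0] * dim                       # feasible starting point
    d = stoch_grad(x, rng.choice(data))   # initial gradient estimate
    for _ in range(steps):
        v = lmo_l1_ball(d, radius)        # one LMO call, no projection
        x_new = [xi + eta * (vi - xi) for xi, vi in zip(x, v)]
        sample = rng.choice(data)         # same sample at both points
        g_new = stoch_grad(x_new, sample)
        g_old = stoch_grad(x, sample)
        # STORM recursion: d_t = g(x_t) + (1 - alpha) * (d_{t-1} - g(x_{t-1}))
        d = [gn + (1.0 - alpha) * (di - go)
             for gn, di, go in zip(g_new, d, g_old)]
        x = x_new
    return x
```

Because each iterate is a convex combination of feasible points (for `eta` in $[0, 1]$), the loop stays inside the constraint set without any projection; calling `fw_gap(full_grad(x, data), x, radius)` on the returned point gives a stationarity measure of the kind the theorem bounds.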

Figures (4)

Figure 1: Results for matrix optimization with low-rank constraints.
Figure 2: Results for risk-averse portfolio optimization.
Figure 3: Results for risk-averse portfolio optimization.
Figure 4: Results for risk-averse portfolio optimization.

Theorems & Definitions (22)

Theorem 1
Theorem 2
Theorem 3
Theorem 4
Definition 1
Definition 2
Theorem 5
Theorem 6
Theorem 7
Theorem 8
...and 12 more
