Two-stage stochastic algorithm for solving large-scale (non)-convex separable optimization problems under affine constraints
Benjamin Dubois-Taine, Laurent Pfeiffer, Nadia Oudjane, Adrien Seguret, Francis Bach
TL;DR
The paper tackles large-scale, separable optimization with affine coupling constraints, addressing the prohibitive cost of computing the Fenchel conjugates of all $N$ component functions at every iteration. It introduces a two-stage method: a stochastic dual subgradient stage to rapidly approximate the dual optimum, followed by a block-coordinate Frank-Wolfe stage to derive primal solutions from the dual information. In the convex setting, the method achieves an overall conjugate-evaluation complexity of $O\left(\frac{1}{\varepsilon^2}+\frac{N}{\varepsilon^{2/3}}\right)$, compared with $O\left(\frac{N}{\varepsilon^2}\right)$ for the standard dual subgradient baseline. The authors extend the framework to nonconvex component functions, leveraging Shapley-Folkman type bounds and Carathéodory decompositions to preserve convergence guarantees and provide practical reconstruction schemes. Numerical experiments on large-scale data, including EV charging, demonstrate substantial empirical improvements and confirm the predicted convergence behavior.
Abstract
We consider nonsmooth optimization problems under affine constraints, where the objective consists of the average of the component functions of a large number $N$ of agents, and we only assume access to the Fenchel conjugate of the component functions. The algorithm of choice for solving such problems is the dual subgradient method, also known as dual decomposition, which requires $O\left(\frac{1}{\varepsilon^2}\right)$ iterations to reach $\varepsilon$-optimality in the convex case. However, each iteration requires computing the Fenchel conjugate of each of the $N$ agents, leading to a complexity of $O\left(\frac{N}{\varepsilon^2}\right)$, which might be prohibitive in practical applications. To overcome this, we propose a two-stage algorithm that runs a stochastic subgradient algorithm on the dual problem, followed by a block-coordinate Frank-Wolfe algorithm to obtain primal solutions. The resulting algorithm requires only $O\left(\frac{1}{\varepsilon^2} + \frac{N}{\varepsilon^{2/3}}\right)$ calls to Fenchel conjugates to obtain an $\varepsilon$-optimal primal solution in expectation in the convex case. We extend our results to nonconvex component functions and show that our method still applies and attains (almost) the same convergence rate, though only to an approximate primal solution, recovering the classical duality gap bounds usually obtained using the Shapley-Folkman theorem.
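To give a feel for the gap between the two complexity bounds quoted above, the following is a minimal Python sketch that evaluates both expressions for a few values of $N$ and $\varepsilon$. It assumes all hidden constants in the $O(\cdot)$ bounds equal 1, which the paper does not claim, so the numbers are orders of magnitude only. The function names `baseline_calls` and `two_stage_calls` are hypothetical and chosen for illustration.

```python
def baseline_calls(n_agents: int, eps: float) -> float:
    """Order of conjugate calls for the full dual subgradient method: N / eps^2.

    Assumption: the constant hidden in the O(.) bound is taken to be 1.
    """
    return n_agents / eps**2


def two_stage_calls(n_agents: int, eps: float) -> float:
    """Order of conjugate calls for the two-stage method: 1/eps^2 + N / eps^(2/3).

    Assumption: the constants hidden in the O(.) bound are taken to be 1.
    """
    return 1.0 / eps**2 + n_agents / eps ** (2.0 / 3.0)


# Illustrative (N, eps) pairs; these values are not taken from the paper.
for n_agents, eps in [(10**4, 1e-2), (10**6, 1e-3)]:
    base = baseline_calls(n_agents, eps)
    two = two_stage_calls(n_agents, eps)
    print(f"N={n_agents:>8d}  eps={eps:g}  baseline~{base:.3g}  "
          f"two-stage~{two:.3g}  ratio~{base / two:.3g}")
```

Under these assumptions, the second term $\frac{N}{\varepsilon^{2/3}}$ dominates the two-stage count whenever $N \geq \varepsilon^{-4/3}$, so the saving over the baseline is then roughly a factor of $\varepsilon^{-4/3}$.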
