Deterministic and Stochastic Frank-Wolfe Recursion on Probability Spaces
Di Yu, Shane G. Henderson, Raghu Pasupathy
TL;DR
This work develops deterministic and stochastic Frank-Wolfe (dFW and sFW) recursions for optimization over probability measures, using the influence function as the central first-order object. A key result is that the Frank-Wolfe subproblem admits a closed-form solution: a Dirac measure at a minimizer of the influence function, which enables efficient particle-update schemes. dFW achieves $O(k^{-1})$ convergence for convex, $L$-smooth objectives. sFW attains $O(k^{-1})$ almost surely and in expectation for smooth convex objectives, and $O(k^{-1/2})$ in Frank-Wolfe gap for smooth nonconvex objectives; a fixed-step, fixed-sample variant converges exponentially to $\varepsilon$-optimality. A central limit theorem is provided for the observed objective values. The paper also presents a broad set of examples (calibration, optimal design, P-means, neural networks, CRE, Gaussian deconvolution) that illustrate how the influence function is computed and how the proposed recursions behave, and it discusses connections to particle-based methods and potential extensions.
Abstract
Motivated by applications in emergency response and experimental design, we consider smooth stochastic optimization problems over probability measures supported on compact subsets of Euclidean space. With the influence function as the variational object, we construct a deterministic Frank-Wolfe (dFW) recursion for probability spaces, made possible by a lemma that identifies a ``closed-form'' solution to the infinite-dimensional Frank-Wolfe sub-problem. Each iterate in dFW is expressed as a convex combination of the incumbent iterate and a Dirac measure concentrated at a minimizer of the influence function at the incumbent iterate. To address common application contexts that have access only to Monte Carlo observations of the objective and influence function, we construct a stochastic Frank-Wolfe (sFW) variant that generates a random sequence of probability measures constructed using minimizers of increasingly accurate estimates of the influence function. We demonstrate that sFW's optimality gap sequence exhibits $O(k^{-1})$ iteration complexity almost surely and in expectation for smooth convex objectives, and $O(k^{-1/2})$ (in Frank-Wolfe gap) for smooth non-convex objectives. Furthermore, we show that an easy-to-implement fixed-step, fixed-sample version of sFW exhibits exponential convergence to $\varepsilon$-optimality. We end with a central limit theorem on the observed objective values at the sequence of generated random measures. To build intuition, we include several illustrative examples with exact influence function calculations.
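The dFW update described in the abstract (mix the incumbent measure with a Dirac measure at a minimizer of the influence function) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy objective $J(\mu) = \tfrac{1}{2}\|E_\mu[\phi(X)] - t\|^2$ with hypothetical features $\phi(x) = (x, x^2)$, the search for the minimizer over a finite grid on $[0,1]$, the classical step size $2/(k+2)$, and the function names. For this objective the influence function, taken as the directional derivative of $J$ at $\mu$ toward $\delta_x$, is $h_\mu(x) = (\phi(x) - E_\mu[\phi(X)])^\top (E_\mu[\phi(X)] - t)$.

```python
import numpy as np

def features(x):
    # Hypothetical moment features phi(x) = (x, x^2); an illustrative choice.
    x = np.asarray(x, dtype=float)
    return np.stack([x, x ** 2], axis=-1)

def objective(points, weights, target):
    # J(mu) = 0.5 * ||E_mu[phi(X)] - target||^2 for mu = sum_i w_i * delta_{x_i}.
    moments = weights @ features(points)
    return 0.5 * np.sum((moments - target) ** 2)

def influence(x, points, weights, target):
    # Directional derivative of J at mu toward delta_x:
    # h_mu(x) = (phi(x) - E_mu[phi])^T (E_mu[phi] - target).
    moments = weights @ features(points)
    return (features(x) - moments) @ (moments - target)

def dfw(target, grid, num_iters=200):
    # Start from a Dirac measure at the first grid point.
    points = np.array([grid[0]])
    weights = np.array([1.0])
    history = [objective(points, weights, target)]
    for k in range(num_iters):
        # Frank-Wolfe subproblem: a minimizer of the influence function
        # (approximated here by a search over the grid).
        x_star = grid[np.argmin(influence(grid, points, weights, target))]
        eta = 2.0 / (k + 2.0)  # classical Frank-Wolfe step size (assumed)
        # mu_{k+1} = (1 - eta) * mu_k + eta * delta_{x_star}
        weights = np.append((1.0 - eta) * weights, eta)
        points = np.append(points, x_star)
        history.append(objective(points, weights, target))
    return points, weights, history
```

Calling `dfw(np.array([0.5, 1/3]), np.linspace(0.0, 1.0, 101))` targets the first two moments of the uniform distribution on $[0,1]$. Each iterate is stored as a weighted particle set, so every iteration adds one atom and rescales the existing weights.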
