Gradient sampling algorithm for subsmooth functions
Dimitris Boskos, Jorge Cortés, Sonia Martínez
TL;DR
The paper studies non-smooth optimization where the objective is the max over a parameterized family, $f(x)=\max_{\theta\in\Theta}F(x,\theta)$, with inner maximization preventing closed-form objective values and gradients. It extends gradient sampling to a modified gradient sampling (mGS) framework that uses an oracle to approximate inner maximizers and samples nearby gradients to form a descent direction, proving almost-sure convergence to Clarke stationary points under the weaker assumption that $f$ is lower-$\mathcal{C}^2$ (subsmooth) on an open full-measure set. The authors provide convergence proofs, invariance guarantees to convex domains, and a distributionally robust coverage optimization example showing that the objective is lower-$\mathcal{C}^2$ and that iterates can be guided toward a desired convex set without adding hard constraints. Numerical experiments demonstrate robustness to density uncertainty, with two- and multi-agent setups showing convergence where standard gradient methods may fail, and a penalty term ensuring attractivity inside a convex region.
Abstract
This paper considers non-smooth optimization problems where we seek to minimize the pointwise maximum of a continuously parameterized family of functions. Since the objective function is given as the solution to a maximization problem, neither its values nor its gradients are available in closed form, which calls for approximation. Our approach hinges upon extending the so-called gradient sampling algorithm, which approximates the Clarke generalized gradient of the objective function at a point by sampling its derivative at nearby locations. This allows us to select descent directions around points where the function may fail to be differentiable and establish algorithm convergence to a stationary point from any initial condition. Our key contribution is to prove this convergence by alleviating the requirement on continuous differentiability of the objective function on an open set of full measure. We further provide assumptions under which a desired convex subset of the decision space is rendered attractive for the iterates of the algorithm.
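The core idea of gradient sampling described above can be illustrated with a small sketch. It shows the classical scheme only (sample gradients near the current point, take the minimum-norm element of their convex hull, and step against it), not the paper's mGS algorithm or its inner-maximization oracle. The toy objective $f(x)=|x_1|+\tfrac12 x_2^2=\max_{\theta\in[-1,1]}\theta x_1+\tfrac12 x_2^2$, the function names, and the Frank-Wolfe solver used for the minimum-norm problem are all assumptions chosen for illustration.

```python
import numpy as np


def min_norm_in_hull(G, iters=500):
    """Approximate the minimum-norm point of conv{rows of G} with Frank-Wolfe.

    This solver is an illustrative choice; any QP solver would do.
    """
    m = G.shape[0]
    w = np.full(m, 1.0 / m)            # convex weights over sampled gradients
    for _ in range(iters):
        g = w @ G                      # current point in the convex hull
        j = int(np.argmin(G @ g))      # vertex minimizing the linearized objective
        d = G[j] - g
        denom = d @ d
        if denom < 1e-16:
            break
        # Exact line search for ||g + t d||^2 over t in [0, 1].
        step = float(np.clip(-(g @ d) / denom, 0.0, 1.0))
        if step == 0.0:
            break                      # no vertex improves: g is optimal
        w = (1.0 - step) * w
        w[j] += step
    return w @ G


def gradient_sampling_direction(grad, x, eps, n_samples, rng):
    """Return (direction, norm) from gradients sampled in the eps-ball around x."""
    dim = x.size
    dirs = rng.normal(size=(n_samples, dim))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = eps * rng.uniform(size=(n_samples, 1)) ** (1.0 / dim)
    points = np.vstack([x, x + radii * dirs])   # include x itself
    G = np.array([grad(p) for p in points])
    g = min_norm_in_hull(G)
    return -g, float(np.linalg.norm(g))


def toy_grad(x):
    """Gradient of |x1| + 0.5 * x2^2 wherever it is differentiable (x1 != 0)."""
    return np.array([np.sign(x[0]), x[1]])


rng = np.random.default_rng(0)

# Near the kink x1 = 0, sampled gradients have both signs in the first
# component, so the minimum-norm element is close to zero (near-stationarity).
_, norm_near_kink = gradient_sampling_direction(
    toy_grad, np.array([1e-3, 0.0]), eps=0.1, n_samples=20, rng=rng
)

# Away from the kink, all sampled gradients agree in the first component,
# so the direction is a genuine descent direction.
direction_far, norm_far = gradient_sampling_direction(
    toy_grad, np.array([1.0, 0.5]), eps=0.1, n_samples=20, rng=rng
)
```

In a full method, a small norm would trigger a reduction of the sampling radius `eps`, while a large norm would be followed by a line search along the returned direction; both steps are omitted here to keep the sketch short.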
