TRAFS: A Nonsmooth Convex Optimization Algorithm with $\mathcal{O}\left(\frac{1}{ε}\right)$ Iteration Complexity
Kai Jia, Martin Rinard
TL;DR
TRAFS addresses constrained minimization of convex Lipschitz functions that may be nondifferentiable, using a functional subdifferential (a set of subgradients) to guide descent. In each iteration it solves a minimax problem inside an $\ell_2$ trust region to obtain a robust descent step. It needs $\mathcal{O}\left(ε^{-1}\right)$ iterations for Lipschitz objectives and $\mathcal{O}\left(ε^{-0.5}\right)$ for strongly convex ones, improving on the previously best-known bounds of $\mathcal{O}\left(ε^{-2}\right)$ and $\mathcal{O}\left(ε^{-1}\right)$, respectively. When the functional subdifferential satisfies a locally quadratic property, TRAFS converges faster, including at a linear rate for strongly convex smooth functions, and an adaptive variant further improves practical performance. In the experiments, TRAFS is on average about 39x faster than the second-best method and solves twice as many problems. Compositional rules for functional subdifferentials make the approach broadly applicable, enabling efficient minimax solutions across a wide class of nonsmooth problems.
Abstract
We present the Trust Region Adversarial Functional Subdifferential (TRAFS) algorithm for constrained optimization of nonsmooth convex Lipschitz functions. Unlike previous methods that assume a subgradient oracle model, we work with the functional subdifferential defined as a set of subgradients that simultaneously captures sufficient local information for effective minimization while being easy to compute for a wide range of functions. In each iteration, TRAFS finds the best step vector in an $\ell_2$-bounded trust region by considering the worst bound given by the functional subdifferential. TRAFS finds an approximate solution with an absolute error up to $ε$ in $\mathcal{O}\left( ε^{-1}\right)$ or $\mathcal{O}\left(ε^{-0.5} \right)$ iterations depending on whether the objective function is strongly convex, compared to the previously best-known bounds of $\mathcal{O}\left(ε^{-2}\right)$ and $\mathcal{O}\left(ε^{-1}\right)$ in these settings. TRAFS makes faster progress if the functional subdifferential satisfies a locally quadratic property; as a corollary, TRAFS achieves linear convergence (i.e., $\mathcal{O}\left(\log ε^{-1}\right)$) for strongly convex smooth functions. In the numerical experiments, TRAFS is on average 39.1x faster and solves twice as many problems compared to the second-best method.
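To make the per-iteration minimax step concrete, the following is a minimal sketch under strong simplifying assumptions, not the TRAFS implementation. It assumes the functional subdifferential is replaced by a finite set of just two subgradients `g1` and `g2`, and it solves $\min_{\|d\|_2 \le r} \max(g_1^\top d, g_2^\top d)$ using the standard identity that the minimizer is $-r\,g^\ast/\|g^\ast\|_2$, where $g^\ast$ is the minimum-norm point of the segment between the two subgradients. The function names `min_norm_on_segment` and `minimax_step` are hypothetical and chosen only for illustration.

```python
import math


def min_norm_on_segment(g1, g2):
    """Minimum-norm point of the segment {g2 + t*(g1 - g2) : 0 <= t <= 1}."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return list(g1)  # both subgradients coincide
    # Unconstrained minimizer of ||g2 + t*diff||^2, clipped to [0, 1].
    t = -sum(b * d for b, d in zip(g2, diff)) / denom
    t = min(1.0, max(0.0, t))
    return [b + t * d for b, d in zip(g2, diff)]


def minimax_step(g1, g2, radius):
    """Solve min over ||d||_2 <= radius of max(g1.d, g2.d).

    Assumption: the set of subgradients is just {g1, g2}. Returns the
    step d and the worst-case linear bound max(g1.d, g2.d) at that step.
    """
    g_star = min_norm_on_segment(g1, g2)
    norm = math.sqrt(sum(c * c for c in g_star))
    if norm == 0.0:
        # Zero lies between the subgradients: no descent step is guaranteed.
        return [0.0] * len(g_star), 0.0
    step = [-radius * c / norm for c in g_star]
    return step, -radius * norm


# Illustration on f(x) = |x_1| + x_2 near x_1 = 0, where (1, 1) and
# (-1, 1) are both subgradients of f.
step, bound = minimax_step([1.0, 1.0], [-1.0, 1.0], radius=0.5)
```

In this toy case the step moves straight down the kink along the second coordinate, and both subgradients agree on the predicted decrease. Stepping along the negative of either subgradient alone with the same radius would give a worst-case bound of zero, which illustrates why accounting for the worst case over a set of subgradients can yield a more robust step than a single subgradient query.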
