Optimal Approximation -- Smoothness Tradeoffs for Soft-Max Functions
Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, Manolis Zampetakis
TL;DR
This work formalizes the tradeoff between approximation quality and smoothness for soft-max functions and introduces two novel mechanisms that optimize different aspects of this tradeoff. PLSoftMax achieves worst-case additive approximation with favorable ℓ_p–ℓ_q Lipschitz smoothness, producing sparse outputs that are beneficial for learning and mechanisms. The Power mechanism enables multiplicative (log-domain) approximations and Rényi-divergence smoothness, improving differentially private submodular optimization and related tasks. Together with rigorous lower bounds and construction details, the paper provides a unified framework linking soft-max design to mechanism design, privacy, and sparse classification, with concrete theoretical guarantees and practical implications.
Abstract
A soft-max function has two main efficiency measures: (1) approximation - which corresponds to how well it approximates the maximum function, (2) smoothness - which shows how sensitive it is to changes of its input. Our goal is to identify the optimal approximation-smoothness tradeoffs for different measures of approximation and smoothness. This leads to novel soft-max functions, each of which is optimal for a different application. The most commonly used soft-max function, called exponential mechanism, has optimal tradeoff between approximation measured in terms of expected additive approximation and smoothness measured with respect to Rényi Divergence. We introduce a soft-max function, called "piecewise linear soft-max", with optimal tradeoff between approximation, measured in terms of worst-case additive approximation and smoothness, measured with respect to $\ell_q$-norm. The worst-case approximation guarantee of the piecewise linear mechanism enforces sparsity in the output of our soft-max function, a property that is known to be important in Machine Learning applications [Martins et al. '16, Laha et al. '18] and is not satisfied by the exponential mechanism. Moreover, the $\ell_q$-smoothness is suitable for applications in Mechanism Design and Game Theory where the piecewise linear mechanism outperforms the exponential mechanism. Finally, we investigate another soft-max function, called power mechanism, with optimal tradeoff between expected \textit{multiplicative} approximation and smoothness with respect to the Rényi Divergence, which provides improved theoretical and practical results in differentially private submodular optimization.
