Optimal Approximation -- Smoothness Tradeoffs for Soft-Max Functions

Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, Manolis Zampetakis

TL;DR

This work formalizes the tradeoff between approximation quality and smoothness for soft-max functions and introduces two novel mechanisms that optimize different aspects of this tradeoff. PLSoftMax achieves worst-case additive approximation with favorable ℓ_p–ℓ_q Lipschitz smoothness, producing sparse outputs that are beneficial for learning and mechanisms. The Power mechanism enables multiplicative (log-domain) approximations and Rényi-divergence smoothness, improving differentially private submodular optimization and related tasks. Together with rigorous lower bounds and construction details, the paper provides a unified framework linking soft-max design to mechanism design, privacy, and sparse classification, with concrete theoretical guarantees and practical implications.

Abstract

A soft-max function has two main efficiency measures: (1) approximation - which corresponds to how well it approximates the maximum function, (2) smoothness - which shows how sensitive it is to changes of its input. Our goal is to identify the optimal approximation-smoothness tradeoffs for different measures of approximation and smoothness. This leads to novel soft-max functions, each of which is optimal for a different application. The most commonly used soft-max function, called exponential mechanism, has optimal tradeoff between approximation measured in terms of expected additive approximation and smoothness measured with respect to Rényi Divergence. We introduce a soft-max function, called "piecewise linear soft-max", with optimal tradeoff between approximation, measured in terms of worst-case additive approximation and smoothness, measured with respect to $\ell_q$-norm. The worst-case approximation guarantee of the piecewise linear mechanism enforces sparsity in the output of our soft-max function, a property that is known to be important in Machine Learning applications [Martins et al. '16, Laha et al. '18] and is not satisfied by the exponential mechanism. Moreover, the $\ell_q$-smoothness is suitable for applications in Mechanism Design and Game Theory where the piecewise linear mechanism outperforms the exponential mechanism. Finally, we investigate another soft-max function, called power mechanism, with optimal tradeoff between expected multiplicative approximation and smoothness with respect to the Rényi Divergence, which provides improved theoretical and practical results in differentially private submodular optimization.

Paper Structure

This paper contains 33 sections, 26 theorems, 123 equations, 2 figures.

Key Result

Theorem 3.1

For any $\delta > 0$ and $p, \alpha \ge 1$, the soft-max function $\textsc{Exp}^\lambda$ with $\lambda = \log(d)/\delta$ satisfies the following: (1) it is $\delta$-approximate, and (2) it is $(\ell_p, D_{\alpha})$-Lipschitz continuous with a Lipschitz constant less than $2\lambda$.
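The two properties in Theorem 3.1 can be checked numerically on a small input. The sketch below is illustrative only: the function names (`exp_softmax`, `additive_gap`, `renyi_divergence`) are hypothetical and not from the paper, "δ-approximate" is read as the expected additive gap $\max_i x_i - \langle f(x), x \rangle \le \delta$ described in the abstract, the Rényi divergence is implemented only for orders $\alpha > 1$, and the $\ell_\infty$ distance is used because it lower-bounds every $\ell_p$ distance with $p \ge 1$.

```python
import math

def exp_softmax(x, delta):
    """Exponential mechanism Exp^lambda with lambda = log(d) / delta."""
    d = len(x)
    lam = math.log(d) / delta
    m = max(x)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(lam * (xi - m)) for xi in x]
    total = sum(weights)
    return [w / total for w in weights]

def additive_gap(x, p):
    """Expected additive approximation error: max_i x_i - <p, x>."""
    return max(x) - sum(pi * xi for pi, xi in zip(p, x))

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for alpha > 1 (assumes q_i > 0)."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)

# Hypothetical example inputs, chosen only for illustration.
delta = 0.1
x = [1.0, 0.5, 0.0, 0.9]
y = [0.9, 0.6, 0.1, 0.9]

p = exp_softmax(x, delta)
q = exp_softmax(y, delta)

lam = math.log(len(x)) / delta
dist_inf = max(abs(a - b) for a, b in zip(x, y))

print("additive gap:", additive_gap(x, p), "<= delta =", delta)
print("D_2(p || q):", renyi_divergence(p, q, 2.0), "<= 2*lam*dist =", 2 * lam * dist_inf)
```

Every coordinate of `p` is strictly positive here, which matches the abstract's remark that the exponential mechanism does not produce sparse outputs.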

Figures (2)

  • Figure 1: Smoothness vs utility in the submodular maximization with cardinality constraint $k=10$. The y-axis shows the ratio of the average objective to the (non-private) greedy algorithm. The x-axis represents the sensitivity to the manipulation test of the value of the first element selected.
  • Figure 2: Robustness vs objective value in the submodular maximization with cardinality constraint $k=10$. The y-axis shows the ratio of the average objective obtained vs the (non-private) greedy algorithm. The x-axis represents the sensitivity to the manipulation test of the value of the first element selected.

Theorems & Definitions (48)

  • Theorem 3.1 [McSherry and Talwar '07]
  • Theorem 3.2
  • Theorem 3.3
  • Definition 4.1: Soft-Max Matrix
  • Definition 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Theorem 4.5
  • Definition 5.1
  • Proposition 5.2
  • ...and 38 more