
Some smooth divergences for $\ell_{1}$-approximations

Pierre Bertrand, Wolfgang Stummer

TL;DR

The paper develops smooth approximations of the (weighted) ℓ1 distance and ℓ1 norm, using generalized φ-divergences and a new class called scaled shift divergences. It introduces a smooth generator φ_{α,β,c̃} and shows that, in suitable limits (e.g., α→0+ or α/β→0+), the divergences D_{φ} and D_{φ, P, σ}^{new} converge to weighted or unweighted ℓ1 distances, which yields differentiable surrogates for sparsity-inducing terms. The paper proves these limits and illustrates them in a LASSO-like setting, showing how the smoothed divergences reproduce ℓ1 behavior while remaining differentiable. The results give theoretically justified tools for smoothly approximating ℓ1 penalties in high-dimensional estimation and related applications.
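The limiting behavior described above can be sketched numerically. This summary does not reproduce the formula of the generator φ_{α,β,c̃}, so the sketch below substitutes a standard stand-in, the surrogate sqrt(t² + α²) − α, which is not the paper's generator; the function names and the test vectors are hypothetical. It only illustrates the general idea that a smooth function of a parameter α can approach the weighted ℓ1 distance as α→0+.

```python
import math


def smooth_abs(t, alpha):
    """Stand-in smooth surrogate for |t| (NOT the paper's generator).

    sqrt(t^2 + alpha^2) - alpha is differentiable everywhere for alpha > 0
    and satisfies 0 <= |t| - smooth_abs(t, alpha) <= alpha.
    """
    return math.sqrt(t * t + alpha * alpha) - alpha


def weighted_l1(q, p, weights):
    """Exact weighted l1 distance: sum_k w_k * |q_k - p_k|."""
    return sum(w * abs(qk - pk) for qk, pk, w in zip(q, p, weights))


def smooth_weighted_l1(q, p, weights, alpha):
    """Smooth approximation of the weighted l1 distance via smooth_abs."""
    return sum(w * smooth_abs(qk - pk, alpha) for qk, pk, w in zip(q, p, weights))


# Hypothetical example data, chosen only for illustration.
q = [0.5, -1.0, 2.0]
p = [0.0, 0.0, 0.0]
weights = [1.0, 2.0, 0.5]

exact = weighted_l1(q, p, weights)
for alpha in (1.0, 0.1, 0.001):
    approx = smooth_weighted_l1(q, p, weights, alpha)
    # The gap is nonnegative and bounded by alpha * sum(weights).
    print(f"alpha={alpha}: smooth={approx:.6f}, gap={exact - approx:.6f}")
```

With this stand-in, the approximation error is bounded by α times the sum of the weights, so it vanishes as α→0+. The paper's own convergence statements concern its specific divergences and may involve different rates and parameter regimes.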

Abstract

For some smooth special case of generalized $\varphi$-divergences as well as of new divergences (called scaled shift divergences), we derive approximations of the omnipresent (weighted) $\ell_{1}$-distance and (weighted) $\ell_{1}$-norm.

Paper Structure

This paper contains 5 sections, 2 theorems, 29 equations, 1 figure.

Key Result

Proposition

(a) For all $t \in \, ]-\infty,\infty[$, $\beta \in \, ]0,\infty[$ and $\widetilde{c} \in \, ]0,\infty[$ there holds

(b) For all $t \in \, ]-\infty,\infty[$, $\alpha \in \, ]0,\infty[$ and $\beta \in \, ]0,\infty[$ there holds

(c) For all $\beta \in \, ]0,\infty[$, $\widetilde{c} \in \, ]0,\infty[$, $\mathbf{Q} \in \mathbb{R}^{K}$ and $\mathbf{P} \in \mathbb{R}_{>0}^{K}$ there holds

(d) For all

Figures (1)

  • Figure 1:

Theorems & Definitions (2)

  • Proposition
  • Proposition