Effective regions and kernels in continuous sparse regularisation, with application to sketched mixtures

Yohann De Castro, Rémi Gribonval, Nicolas Jouvin

TL;DR

The paper advances off-grid sparse recovery by extending BLASSO on measures through a kernel-switch framework and effective near-region analysis. It proves the sinc-4 pivot kernel satisfies the local positive curvature condition and leverages an embedding constant to transfer guarantees to a broad class of model and sketched kernels, enabling tractable analysis with sketching. A key practical contribution is the S2Mix method for sketched mixture modelling, which achieves near-optimal parameter recovery with sketch sizes $m = O(s_0 \log^2(s_0))$ and adapts the localisation radius $r$ to the noise level $\gamma$, yielding near-optimal rates as sample size grows. The results generalise BLASSO guarantees to translation-invariant kernels and smoothing schemes beyond Gaussian templates, with near-minimizer solutions inheriting the same guarantees, thus enhancing both theoretical understanding and computational efficiency in continuous sparse regression for mixtures.

Abstract

This paper advances the general theory of continuous sparse regularisation on measures with the Beurling-LASSO (BLASSO). This TV-regularised convex program on the space of measures makes it possible to recover a sparse measure from a noisy observation obtained through a measurement operator. While previous works have uncovered the central role played by this operator and its associated kernel in obtaining estimation error bounds, these bounds require a technical local positive curvature (LPC) assumption that must be verified on a case-by-case basis. In practice, the condition has been proved for only a few kernels (LPC-kernels). In this paper, we prove that the "sinc-4" kernel, used for signal recovery and mixture problems, does satisfy the LPC assumption. Furthermore, we introduce the kernel switch analysis, which makes it possible to leverage a known LPC-kernel as a pivot kernel to prove error bounds. Together, these results provide easy-to-check conditions to obtain error bounds for a large family of translation-invariant model kernels. We also show that known BLASSO guarantees can be made adaptive to the noise level. This improves on known results, where this error is fixed by parameters depending on the model kernel. We illustrate our results in the case of mixture model estimation, using band-limiting smoothing and sketching techniques to reduce the computational burden of BLASSO.
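
To make the sketching idea concrete, here is a minimal Python sketch of one common way to compress a sample into a fixed-size vector: averaging complex exponentials at $m$ random frequencies (an empirical characteristic function). This is an illustration under stated assumptions, not the paper's construction: the function name `empirical_sketch`, the Gaussian frequency law, the two-component mixture, and all numerical values are hypothetical choices made for the example.

```python
import numpy as np


def empirical_sketch(samples, frequencies):
    """Average exp(i <omega_j, x_i>) over the samples, one entry per frequency."""
    # samples: (n, d), frequencies: (m, d) -> projections: (n, m)
    projections = samples @ frequencies.T
    return np.exp(1j * projections).mean(axis=0)


rng = np.random.default_rng(0)
n, d, m = 5000, 2, 32  # sample size, dimension, sketch size (illustrative values)

# Hypothetical two-component Gaussian mixture, used only as toy data.
centers = np.array([[-2.0, 0.0], [2.0, 1.0]])
component = rng.integers(0, 2, size=n)
samples = centers[component] + 0.5 * rng.standard_normal((n, d))

# Assumption: frequencies drawn from a standard Gaussian; the paper's
# frequency distribution may differ.
frequencies = rng.standard_normal((m, d))

sketch = empirical_sketch(samples, frequencies)
print(sketch.shape)  # one complex number per frequency
```

Once the sketch is computed, the $n$ samples are no longer needed: any downstream estimation only sees the $m$ complex numbers, which is where the computational saving comes from.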

Paper Structure

This paper contains 61 sections, 22 theorems, 177 equations, 3 figures, 2 tables.

Key Result

Lemma 3.1

Let $\alpha > 0$. Then with probability at least $1 - \alpha$, it holds that […], where $C_\alpha := 2\sqrt{1 + C_1 \log(C_2/\alpha)}$ is a constant depending only on $\alpha$, while $C_1$ and $C_2$ are universal constants.

Figures (3)

  • Figure 1: Illustration of the kernel switch principle. Left: A small set of pivot kernels known to satisfy the local positive curvature assumption ($\mathop{\mathrm{LPC}}\nolimits$) enables theoretical guarantees for a much larger class of model kernels. Right: A model kernel $K_{\textnormal{mod}}$ is admissible for our statistical guarantees if there exists a pivot LPC kernel (here $K_{\textnormal{sinc}}$) whose RKHS is continuously embedded into the model RKHS. The error bounds remain valid with an additional scaling factor $C_{\textnormal{switch}}(K_\textnormal{mod}, K_{\textnormal{sinc}})$. In this example, using the sinc kernel as pivot is valid because its RKHS is included in the model RKHS, while the Gaussian kernel cannot serve as a pivot since its RKHS contains functions with faster-decaying Fourier transforms than the model allows.
  • Figure 2: Two-dimensional illustration of near (light maroon) and far (light gray) regions with fixed radius $r$ and a distance ${\mathfrak{d}}(\mathbf{s},\mathbf{t}) \propto \Vert \mathbf{s} - \mathbf{t} \Vert^2$.
  • Figure 3: On the left, a one-dimensional illustration of the non-degenerate dual certificate $\eta^0$ in the near regions and far regions for some target $\mu^0$ with two spikes $x_1$ and $x_2$, shown in light green. On the right, a zoomed view of the curvature control in the near region $\mathcal{N}^{\textnormal{reg}}_1(r)$ around $t_1$. The certificates are drawn in orange, while the blue dotted lines illustrate the required control.
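
The near/far partition shown in Figure 2 can be sketched in a few lines of Python. This is a hedged illustration, not the paper's definition: it assumes a point lies in the near region of a spike when its distance to that spike is at most $r$, takes the distance to be `scale * ||s - t||^2` with a hypothetical proportionality constant `scale`, and uses made-up spike and query locations. The names `squared_distance` and `region_labels` are introduced here for the example only.

```python
import numpy as np


def squared_distance(s, t, scale=1.0):
    """Distance proportional to the squared Euclidean norm (scale is an assumption)."""
    return scale * np.sum((s - t) ** 2, axis=-1)


def region_labels(points, spikes, r, scale=1.0):
    """Return the index of the near region containing each point, or -1 if far."""
    # dists has shape (n_points, n_spikes)
    dists = squared_distance(points[:, None, :], spikes[None, :, :], scale)
    nearest = dists.argmin(axis=1)
    # Assumed rule: near if the distance to the closest spike is at most r.
    is_near = dists.min(axis=1) <= r
    return np.where(is_near, nearest, -1)


# Hypothetical spike locations and query points in the plane.
spikes = np.array([[0.0, 0.0], [3.0, 0.0]])
points = np.array([[0.1, 0.0], [2.9, 0.2], [1.5, 0.0]])

labels = region_labels(points, spikes, r=0.25)
print(labels)  # [ 0  1 -1]
```

The first two points fall in the near regions of the first and second spike respectively, and the midpoint between the spikes is labelled far.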

Theorems & Definitions (57)

  • Definition 1: Model kernel
  • Definition 2: Sparse models
  • Remark 2.1
  • Definition 3: Far and Near regions
  • Remark 2.2: On regularisation parameter calibration
  • Lemma 3.1: Control of the noise level $\gamma_n$
  • proof
  • Proposition 3.1: Statistical guarantees for the SuperMix problem
  • Remark 3.1: Notational comparison with SuperMix
  • Remark 3.2: Model set with sinc-4 pivot
  • ...and 47 more