Samplet basis pursuit: Multiresolution scattered data approximation with sparsity constraints

Davide Baroli; Helmut Harbrecht; Michael Multerer

TL;DR

This work develops a sparse, multiresolution framework for scattered data approximation: samplets, wavelet-type signed measures tailored to scattered data, are embedded into a reproducing kernel Hilbert space via the Riesz isometry, and the coefficients are computed with $\ell_1$-regularization. The class of signals that are sparse in the embedded samplet basis is argued to be considerably larger than the class that is sparse in the basis of kernel translates. For the solution, the paper combines soft-shrinkage with a semi-smooth Newton method, which exploits the sparse samplet representation of kernel matrices and converges faster than the fast iterative shrinkage thresholding algorithm (FISTA). Benchmarks show the multiresolution approach to be superior to the single-scale approach, and experiments on surface reconstruction and large-scale temperature data demonstrate accurate, localized reconstructions with substantial sparsity. The framework generalizes to multiple kernels and space-time data, enabling sparse representations for irregularly sampled data.

Abstract

We consider scattered data approximation in samplet coordinates with $\ell_1$-regularization. The application of an $\ell_1$-regularization term enforces sparsity of the coefficients with respect to the samplet basis. Samplets are wavelet-type signed measures, which are tailored to scattered data. Therefore, samplets enable the use of well-established multiresolution techniques on general scattered data sets. They offer properties similar to those of wavelets in terms of localization, multiresolution analysis, and data compression. By using the Riesz isometry, we embed samplets into reproducing kernel Hilbert spaces and discuss the properties of the resulting functions. We argue that the class of signals that are sparse with respect to the embedded samplet basis is considerably larger than the class of signals that are sparse with respect to the basis of kernel translates. Conversely, every signal that is a linear combination of only a few kernel translates is sparse in samplet coordinates. We propose the rapid solution of the problem under consideration by combining soft-shrinkage with the semi-smooth Newton method. Leveraging the sparse representation of kernel matrices in samplet coordinates, this approach converges faster than the fast iterative shrinkage thresholding algorithm and is feasible for large-scale data. Numerical benchmarks are presented and demonstrate the superiority of the multiresolution approach over the single-scale approach. As large-scale applications, surface reconstruction from scattered data and the reconstruction of scattered temperature data using a dictionary of multiple kernels are considered.
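To make the role of soft-shrinkage concrete, the following is a minimal sketch of the fast iterative shrinkage thresholding algorithm for a generic $\ell_1$-regularized least-squares problem, $\min_x \tfrac12\|Ax-b\|_2^2 + \lambda\|x\|_1$. It is not the paper's implementation: the dense matrix `A` merely stands in for a kernel matrix in samplet coordinates, and the names `soft_threshold`, `fista`, `A`, `b`, and `lam` are assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Soft-shrinkage: proximal operator of tau * ||.||_1, applied entrywise."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def fista(A, b, lam, n_iter=500):
    """Approximately minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 (illustrative)."""
    # Lipschitz constant of the gradient of the smooth part: ||A||_2^2.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)                      # gradient at the extrapolated point
        x_new = soft_threshold(y - grad / L, lam / L)  # proximal gradient step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x


def objective(A, b, lam, x):
    """Value of the l1-regularized least-squares functional at x."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
```

For an identity matrix the minimizer is available in closed form as `soft_threshold(b, lam)`, which gives a quick sanity check of the iteration. In this sketch every step costs two dense matrix-vector products; the abstract's point is that the kernel matrix becomes sparse in samplet coordinates, which is what makes such iterations, and the semi-smooth Newton method, feasible for large-scale data.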

Paper Structure (18 sections, 1 theorem, 58 equations, 5 figures, 1 table, 3 algorithms)

This paper contains 18 sections, 1 theorem, 58 equations, 5 figures, 1 table, 3 algorithms.

Introduction
Problem formulation
Samplets
Samplet bases
Samplets in reproducing kernel Hilbert spaces
Characterization of the dual basis
Sparse kernel interpolants
Scattered data approximation
Kernel ridge regression
Samplet basis pursuit
Algorithms for samplet basis pursuit
Fast iterative-shrinkage thresholding algorithm
Semi-smooth Newton method
Numerical results
Benchmarks
...and 3 more sections

Key Result

Theorem 3.1

The samplet basis $\bigcup_{j=0}^J\{\sigma_{j,k}\}_k$ forms an orthonormal basis in $\mathcal{X}$, satisfying the following properties:

Figures (5)

Figure 1: Data samples from the different data generating functions and reconstructions by the different methods.
Figure 2: Sparsity pattern of the data (only coefficients larger than 1% of the maximum coefficient are shown) and sparsity patterns of the corresponding solutions, visualized by the supports of the active samplets, for ridge regression and the semi-smooth Newton method (SSN).
Figure 3: Surface reconstruction from measurements of the signed distance function. The left panel shows a subsample of the data used, the middle panel the reconstruction by ridge regression, and the right panel the reconstruction by $\ell_1$-regularization.
Figure 4: Clusters containing the active coefficients for the case of $\ell_1$-regularization.
Figure 5: Sparse reconstruction of temperature data using two kernels. The relative error is shown in the top row, the reconstructed temperature in the second row, the contribution of $k_1$ is seen in the third row and the contribution of $k_2$ in the last row.

Theorems & Definitions (2)

Theorem 3.1
Remark 3.2
