Randomized Functional Sparse Tucker Tensor for Compression and Fast Visualization of Scientific Data

Prashant Rai, Hemanth Kolla, Lewis Cannada, Alex Gorodetsky

TL;DR

This work introduces the set of functional sparse Tucker tensors and proposes a method to construct an approximation in this set such that the resulting compact functional tensor can be rapidly evaluated to recover the original data.

Abstract

We propose a strategy to compress and store large volumes of scientific data represented on unstructured grids. Approaches that use tensor decompositions for data compression have already been proposed. In these approaches, data on a structured grid is stored as a tensor, which is then decomposed in a suitable tensor format. Such decompositions generalize the singular value decomposition to tensors and capture essential features of the data at a storage cost lower by orders of magnitude. However, tensor-based data compression is limited to scientific data represented on structured grids. For data on unstructured meshes, we propose to treat the data as realizations of a function, based on a functional view of the tensor, thus avoiding this limitation. The key is to efficiently estimate the parameters of the function, whose complexity must be small compared to the cardinality of the dataset (otherwise there is no compression). Here, we introduce the set of functional sparse Tucker tensors and propose a method to construct an approximation in this set such that the resulting compact functional tensor can be rapidly evaluated to recover the original data. The compression procedure consists of three steps. First, we interpolate a fraction of the original dataset onto a structured grid and apply a sequentially truncated higher-order singular value decomposition to obtain a compressed version of the interpolated data. Second, we fit the singular vectors on a set of basis functions using sparse approximation to obtain the corresponding functional sparse Tucker tensor representation. Finally, we re-evaluate the coefficients of this functional tensor using randomized least squares at reduced computational complexity. This strategy leads to compression ratios of orders of magnitude on combustion simulation datasets.
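The sequentially truncated higher-order SVD used in the first step can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the figure captions indicate that the paper relies on TuckerMPI). The function names `unfold`, `mode_product`, `st_hosvd`, and `reconstruct` are hypothetical, and the target ranks are assumed to be given rather than chosen from an error tolerance.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def st_hosvd(tensor, ranks):
    """Sequentially truncated HOSVD with prescribed ranks (assumed known)."""
    core = tensor
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the current, already shrunken core.
        U, _, _ = np.linalg.svd(unfold(core, mode), full_matrices=False)
        U = U[:, :rank]
        factors.append(U)
        # Project immediately so that later SVDs act on a smaller tensor.
        core = mode_product(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand a Tucker representation back to the full tensor."""
    out = core
    for mode, U in enumerate(factors):
        out = mode_product(out, U, mode)
    return out
```

Storage drops from the product of the grid sizes to the size of the core plus the factor matrices. The functional step described in the abstract then replaces each factor column by a sparse expansion in a chosen basis.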

Paper Structure

This paper contains 10 sections, 19 equations, 6 figures, 2 tables, 2 algorithms.

Figures (6)

  • Figure 1: Decay of singular values, i.e., absolute values of the components of the core tensor ${\boldsymbol{\alpha}}$, versus rank for SP3D with a TuckerMPI precision of $1.0\times 10^{-4}$. The rank on the horizontal axis is converted to canonical rank by sorting the singular values in descending order.
  • Figure 2: (a) Approximation of $w^{(1)}_1(y_1)$ using least squares with $\ell_1$ regularization, with the components of $W^{(1)}_{:,1}$ as data points, in the approximation space of Legendre polynomials of degree $p = 20$ and $p = 40$. (b) Pointwise approximation error versus grid index for the two approximations in (a).
  • Figure 3: (a) Approximation of $w^{(1)}_{57}(y_1)$ using least squares with $\ell_1$ regularization, with the components of $W^{(1)}_{:,57}$ as data points, in the approximation space of Legendre polynomials of degree $p = 20$ and wavelets with resolution level 5 and degree 3. (b) Pointwise approximation error versus grid index for the two approximations in (a).
  • Figure 4: Histogram of leverage scores of the measurement matrix ${\boldsymbol{W}}$ (a) before mixing and (b) after mixing (Step 3 of the randomized least squares algorithm).
  • Figure 5: Self-convergence plot of ${\boldsymbol{\alpha}}$. The horizontal axis shows the number of rows $S$ sampled in Step 4 of the randomized least squares algorithm. The vertical axis shows the change in ${\boldsymbol{\alpha}}$ (relative norm) between estimates obtained by sampling $S_1$ and $S_2$ rows ($S_2 = S_1+10^3, S_1\in\{3000,4000,\ldots,17000\}$).
  • ...and 1 more figures
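The mixing and row-sampling steps referenced in the captions of Figures 4 and 5 can be sketched as follows. The paper's algorithm is not reproduced in this section, so the sketch rests on assumptions: mixing is taken to be random sign flips followed by an orthonormal Hadamard transform (a common choice, not confirmed by the source), rows are sampled uniformly, and all function names are hypothetical. The dense Hadamard matrix is for illustration only and requires the row count to be a power of two.

```python
import numpy as np

def leverage_scores(A):
    """Row leverage scores: squared row norms of an orthonormal basis of range(A)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def hadamard(n):
    """Dense orthonormal Walsh-Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = np.ones((1, 1))
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def mix(A, b, rng):
    """Assumed mixing step: random sign flips, then a Hadamard transform."""
    signs = rng.choice([-1.0, 1.0], size=A.shape[0])
    H = hadamard(A.shape[0])
    return H @ (signs[:, None] * A), H @ (signs * b)

def sampled_lstsq(A, b, n_samples, rng):
    """Mix, keep `n_samples` uniformly chosen rows, and solve the small problem."""
    A_mix, b_mix = mix(A, b, rng)
    rows = rng.choice(A.shape[0], size=n_samples, replace=False)
    x, *_ = np.linalg.lstsq(A_mix[rows], b_mix[rows], rcond=None)
    return x
```

Because the mixing transform is orthogonal, the least squares solution of the full problem is unchanged. Mixing only spreads the leverage more evenly across rows, which is what makes uniform sampling of a small number of rows a reasonable approximation.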