Exploration of Learned Lifting-Based Transform Structures for Fully Scalable and Accessible Wavelet-Like Image Compression

Xinyue Li; Aous Naman; David Taubman

TL;DR

Experimental results suggest that retaining the fixed lifting steps from the base wavelet transform is highly beneficial. Employing more learned lifting steps and more layers in each learned lifting operator does not contribute strongly to compression performance, but benefits can be obtained by using more channels in each learned lifting operator.

Abstract

This paper provides a comprehensive study of the features and performance of different ways to incorporate neural networks into lifting-based wavelet-like transforms, within the context of fully scalable and accessible image compression. Specifically, we explore different arrangements of lifting steps, as well as various network architectures for learned lifting operators. Moreover, we examine the impact of the number of learned lifting steps, the number of channels, the number of layers and the support of kernels in each learned lifting operator. To facilitate the study, we investigate two generic training methodologies that are appropriate across the wide variety of lifting structures considered. Experimental results suggest that retaining the fixed lifting steps from the base wavelet transform is highly beneficial. Moreover, we demonstrate that employing more learned lifting steps and more layers in each learned lifting operator does not contribute strongly to compression performance. However, benefits can be obtained by using more channels in each learned lifting operator. Ultimately, the learned wavelet-like transform proposed in this paper achieves over 25% bit-rate savings compared to JPEG 2000 with compact spatial support.

Paper Structure (27 sections, 6 equations, 14 figures, 3 tables)


Introduction
Investigated lifting structures
Predict-update lifting structure
Update-predict lifting structure
Hybrid lifting structure
Lifting structures with more learned lifting steps
Investigated Network Topologies
Significance of the proposal-opacity topology
Particular properties of the opacity branch
Proposed lifting networks
End-to-end optimisation framework and pre-training strategies
End-to-end optimisation with backward annealing
Investigated pre-training strategies
Oracle-opacity pre-training schedule
Pre-training with progressive selection
...and 12 more sections

Figures (14)

Figure 1: (a) The general predict-update lifting structure in one dimension, where $\mathcal{P}$ and $\mathcal{U}$ denote the predict and the update operators, respectively. (b) An example of two-dimensional predict-update lifting structure, in which only two steps are shown for horizontal and vertical directions; this is essentially the lifting structure of the LeGall 5/3 wavelet transform. The symbols $\mathcal{P}^{V}$, $\mathcal{U}^{V}$, $\mathcal{P}^{H}$ and $\mathcal{U}^{H}$ denote the vertical-predict, vertical-update, horizontal-predict and horizontal-update operators, respectively.
Figure 2: The update-predict lifting structure explored in this paper; this structure was first proposed in claypoole2003nonlinear, and was also employed to build the iWave transform in ma2019iwave.
Figure 3: The hybrid lifting structure introduced in this paper; $T^{A}_{H2L}$ and $T^{A}_{L2H}$ denote the high-to-low and the low-to-high operators using neural networks.
Figure 4: The extension of neural networks to the base wavelet transform shown in Fig. 3. Note that the last update step $\mathcal{U}^H$ can be fused into the high-to-low step $\mathcal{T}^H_{H2L}$.
Figure 5: The proposal-opacity neural network topology proposed in our previous work Xinyue2022_journal. The symbol $K \times K$ denotes the filter support while $N$ represents the number of filters (or equivalently the number of channels in the proposal/opacity branch).
...and 9 more figures
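
As a concrete illustration of the one-dimensional predict-update structure in Figure 1(a), the following is a minimal sketch of a single level of the standard LeGall 5/3 lifting transform, which Figure 1(b) names as its base. The function names (`predict`, `update`, `forward`, `inverse`), the even-length input and the symmetric boundary handling are assumptions made for this example; the learned lifting operators studied in the paper are not reproduced here.

```python
def predict(even):
    """Fixed predict operator P: mean of the two neighbouring even samples."""
    n = len(even)
    # Symmetric extension on the right: the missing neighbour mirrors even[n - 1].
    return [0.5 * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]


def update(detail):
    """Fixed update operator U: a quarter of the sum of neighbouring details."""
    n = len(detail)
    # Symmetric extension on the left: the missing neighbour mirrors detail[0].
    return [0.25 * (detail[max(i - 1, 0)] + detail[i]) for i in range(n)]


def forward(x):
    """One level of predict-update lifting on an even-length sequence."""
    even, odd = x[0::2], x[1::2]
    detail = [o - p for o, p in zip(odd, predict(even))]    # high-pass band
    approx = [e + u for e, u in zip(even, update(detail))]  # low-pass band
    return approx, detail


def inverse(approx, detail):
    """Undo the lifting steps in reverse order, with the signs flipped."""
    even = [a - u for a, u in zip(approx, update(detail))]
    odd = [d + p for d, p in zip(detail, predict(even))]
    x = [0.0] * (len(even) + len(odd))
    x[0::2] = even
    x[1::2] = odd
    return x


if __name__ == "__main__":
    signal = [3, 7, 1, 8, 2, 9, 4, 6]
    low, high = forward(signal)
    print(low, high)
    print(inverse(low, high))
```

Because each lifting step only adds to one channel a function of the other, the inverse subtracts the same quantities in reverse order. This property holds regardless of what `predict` and `update` compute, which is what allows lifting operators to be replaced by other functions, including learned ones, without losing invertibility.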
