Learned Compression for Images and Point Clouds

Mateen Ulhaq

TL;DR

This thesis presents an efficient, low-complexity entropy model that dynamically adapts the encoding distribution to a specific input by compressing and transmitting the encoding distribution itself as side information. It also proposes a novel lightweight point cloud codec.

Abstract

Over the last decade, deep learning has shown great success at performing computer vision tasks, including classification, super-resolution, and style transfer. Now, we apply it to data compression to help build the next generation of multimedia codecs. This thesis provides three primary contributions to this new field of learned compression. First, we present an efficient low-complexity entropy model that dynamically adapts the encoding distribution to a specific input by compressing and transmitting the encoding distribution itself as side information. Second, we propose a novel lightweight point cloud codec that is highly specialized for classification, attaining significant reductions in bitrate compared to non-specialized codecs. Lastly, we explore how motion between consecutive video frames in the input domain manifests in the corresponding convolutionally-derived latent space.

Paper Structure (42 sections, 42 equations, 22 figures, 6 tables)


Figures (22)

  • Figure 1: RD curves for image compression codecs on the Kodak dataset [kodak_dataset].
  • Figure 2: High-level comparison of codec architectures.
  • Figure 3: Visualization of an encoding distribution used for compressing a single element $\hat{y}_i$.
  • Figure 4: Visualization of encoding distributions used for compressing a latent tensor ${\boldsymbol{\hat{y}}}$ with dimensions $M_y \times H_y \times W_y$. In (a), the encoding distributions within a given channel are all the same, since the elements within a channel are assumed to be i.i.d. Furthermore, in the case of the fully factorized entropy bottleneck used by Ballé et al. [balle2018variational], each encoding distribution is a static non-parametric distribution. In (b), the encoding distribution for each element is uniquely determined and conditioned on side information. Furthermore, in the case of the Gaussian conditional hyperprior used by Ballé et al. [balle2018variational], the encoding distributions are Gaussian distributions parameterized by a mean and variance.
  • Figure 5: Visualization of the suboptimality of using a single static encoding distribution. This distribution tries to "average" (in an amortized information-theoretic sense) all the best possible data-specific distributions. However, the resulting distribution codes any given input less efficiently than that input's own data-specific distribution would.
  • ...and 17 more figures
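
The suboptimality described in the Figure 5 caption can be illustrated with a small numeric sketch. It is not taken from the thesis: the two symbol distributions `p_a` and `p_b` are hypothetical, and the cost of transmitting side information is ignored. It compares the average rate when every input is coded with one static distribution against the rate when each input is coded with its own distribution.

```python
import math

def cross_entropy_bits(p, q):
    """Expected code length in bits/symbol when data ~ p is coded using q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy_bits(p):
    """Expected code length when data ~ p is coded with its own distribution."""
    return cross_entropy_bits(p, p)

# Hypothetical symbol statistics for two different inputs (assumed values).
p_a = [0.7, 0.2, 0.1]
p_b = [0.1, 0.2, 0.7]
inputs = [p_a, p_b]

# The static distribution that minimizes the average rate over equally
# likely inputs is the element-wise mean of their distributions.
q_static = [sum(col) / len(inputs) for col in zip(*inputs)]

# Average rate with one static distribution vs. per-input distributions.
rate_static = sum(cross_entropy_bits(p, q_static) for p in inputs) / len(inputs)
rate_adaptive = sum(entropy_bits(p) for p in inputs) / len(inputs)

# Excess bits/symbol paid for not adapting (side information not counted).
gap = rate_static - rate_adaptive
print(f"static: {rate_static:.3f}  adaptive: {rate_adaptive:.3f}  gap: {gap:.3f}")
```

In this toy setting the gap is strictly positive. An input-adaptive entropy model is only worthwhile when the bits spent on side information are fewer than this gap, which is why keeping the transmitted distribution cheap matters.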