Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs

Yichi Zhang, Zhihao Duan, Yuning Huang, Fengqing Zhu

TL;DR

The proposed BG-VAE uses the theoretical bound to guide the NIC model towards better performance, and provides a versatile, variable-rate NIC that outperforms existing methods when both rate-distortion performance and computational complexity are considered.

Abstract

Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory, notably in utilizing VAEs to estimate the theoretical upper bound of the information rate-distortion function of images. Such estimated theoretical bounds substantially exceed the performance of existing neural image codecs (NICs). To narrow this gap, we propose a theoretical bound-guided hierarchical VAE (BG-VAE) for NIC. The proposed BG-VAE leverages the theoretical bound to guide the NIC model towards enhanced performance. We implement the BG-VAE using Hierarchical VAEs and demonstrate its effectiveness through extensive experiments. Along with advanced neural network blocks, we provide a versatile, variable-rate NIC that outperforms existing methods when considering both rate-distortion performance and computational complexity. The code is available at BG-VAE.
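
As a rough sketch of the VAE and rate-distortion link mentioned in the abstract: the notation below ($q(z \mid x)$ for the encoder, $p(z)$ for the prior, $g$ for the decoder, $d$ for the distortion measure) is assumed here for illustration and is not taken from the paper. The information rate-distortion function of a source $X$ can be written as

$$
R(D) \;=\; \inf_{q(z \mid x),\, g \;:\; \mathbb{E}\left[d(X, g(Z))\right] \le D} I(X; Z).
$$

For any prior $p(z)$, with $q(z) = \mathbb{E}_{x}\left[q(z \mid x)\right]$ denoting the aggregate posterior, the mutual information satisfies

$$
I(X; Z) \;=\; \mathbb{E}_{x}\!\left[ D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big) \right] - D_{\mathrm{KL}}\big(q(z) \,\|\, p(z)\big) \;\le\; \mathbb{E}_{x}\!\left[ D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big) \right].
$$

Under these assumptions, the rate term of a trained VAE, paired with its expected distortion $D$, gives a point lying on or above the curve $R(D)$. This is the sense in which a VAE can be used to estimate an upper bound on the rate-distortion function.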

Paper Structure

This paper contains 24 sections, 13 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: An overview of the proposed BG-VAE. (a) the bound-guided framework, (b) the model used to implement BG-VAE.
  • Figure 1: The structure of Wavelet Up/Down sampling.
  • Figure 2: The structure of Balanced ConvNeXt block.
  • Figure 2: The structure of Cross Attention block.
  • Figure 3: Illustration of the $i$-th Latent Variable Block.
  • ...and 5 more figures