Exploring Invariance in Images through One-way Wave Equations

Yinpeng Chen; Dongdong Chen; Xiyang Dai; Mengchen Liu; Yinan Feng; Youzuo Lin; Lu Yuan; Zicheng Liu

TL;DR

We reveal an invariance over images: all images share a set of one-way wave equations with latent speeds. Each image can be reconstructed with high fidelity from an initial condition, which we demonstrate using an intuitive encoder-decoder framework.

Abstract

In this paper, we empirically reveal an invariance over images: images share a set of one-way wave equations with latent speeds. Each image is uniquely associated with a solution to these wave equations, allowing it to be reconstructed with high fidelity from an initial condition. We demonstrate this using an intuitive encoder-decoder framework in which each image is encoded into its corresponding initial condition (a single vector). The initial condition is then passed through a specialized decoder that transforms the one-way wave equations into a first-order norm+linear autoregressive process. This process propagates the initial condition along the x and y directions, generating a high-resolution feature map (up to the image resolution), which is followed by a few convolutional layers that reconstruct the image pixels. The revealed invariance, rooted in the shared wave equations, offers a fresh perspective for comprehending images and establishes a promising avenue for further exploration.
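The propagation step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out the update rule, so the form used here is an assumption: each step adds a linear map of the channel-normalized feature vector to the current vector, with one matrix (`A`) for the x direction and one (`B`) for the y direction. The names `normalize`, `finola_sketch`, `A`, and `B` are hypothetical, the matrices are random rather than learned, and the initial condition is placed at a corner instead of the image center for simplicity. The two paths (horizontal first, vertical first) are averaged, following the caption of Figure 4.

```python
import numpy as np

def normalize(z, eps=1e-6):
    """Normalize feature vectors across channels (zero mean, unit std)."""
    mean = z.mean(axis=-1, keepdims=True)
    std = z.std(axis=-1, keepdims=True)
    return (z - mean) / (std + eps)

def finola_sketch(q, A, B, height, width):
    """Propagate an initial vector q of shape (C,) over a height x width grid.

    Assumed update rule (illustrative, not taken from the abstract):
        z[y, x+1] = z[y, x] + A @ normalize(z[y, x])
        z[y+1, x] = z[y, x] + B @ normalize(z[y, x])
    """
    C = q.shape[0]

    # Path 1: horizontal regression along the first row,
    # then vertical regression for all columns at once.
    z_hv = np.zeros((height, width, C))
    z_hv[0, 0] = q
    for x in range(1, width):
        z_hv[0, x] = z_hv[0, x - 1] + A @ normalize(z_hv[0, x - 1])
    for y in range(1, height):
        z_hv[y] = z_hv[y - 1] + normalize(z_hv[y - 1]) @ B.T

    # Path 2: vertical regression along the first column,
    # then horizontal regression for all rows at once.
    z_vh = np.zeros((height, width, C))
    z_vh[0, 0] = q
    for y in range(1, height):
        z_vh[y, 0] = z_vh[y - 1, 0] + B @ normalize(z_vh[y - 1, 0])
    for x in range(1, width):
        z_vh[:, x] = z_vh[:, x - 1] + normalize(z_vh[:, x - 1]) @ A.T

    # Average the two autoregression paths.
    return 0.5 * (z_hv + z_vh)

# Usage with random (untrained) parameters, for shape checking only.
rng = np.random.default_rng(0)
C = 8
q = rng.standard_normal(C)
A = 0.1 * rng.standard_normal((C, C))
B = 0.1 * rng.standard_normal((C, C))
feature_map = finola_sketch(q, A, B, height=4, width=5)
print(feature_map.shape)  # (4, 5, 8)
```

In the framework described above, `q` would come from the image encoder and the resulting feature map would be passed to upsampling and convolutional layers; both are omitted from this sketch.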

Paper Structure (39 sections, 13 equations, 21 figures, 30 tables)

Introduction
First-order Norm+Linear Autoregression
Generalization to One-way Wave Equations
Experiments on Image Reconstruction
Main Properties
Comparison with Previous Techniques
Comparison with JPEG on Image Compression
Application on Self-Supervised Learning
Related Work
Limitations
Conclusion
FINOLA for Image Reconstruction
Conceptual Comparison with DCT/Wavelet Transforms
Implementation Details
Network Architectures
...and 24 more sections

Figures (21)

Figure 1: Exploring invariance through one-way wave equations. All images share a set of one-way wave equations $\frac{\partial \bm{\zeta}}{\partial x}=\bm{\Lambda} \frac{\partial \bm{\zeta}}{\partial y}$ (or transportation equations). Each image corresponds (to a good approximation) to a unique solution with an initial condition $\bm{\zeta}(\frac{W}{2},\frac{H}{2})$ derived from the original image. The solution $\bm{\zeta}(x,y)$ is a feature map (at $\frac{1}{4}$, $\frac{1}{2}$, or the full resolution of the original image) that facilitates image reconstruction using a few upsampling and convolutional layers. The wave speeds, $\lambda_1, \ldots, \lambda_C$, are latent and learnable.
Figure 2: FINOLA for image reconstruction. Each image is first encoded into a single vector $\bm{q}$. Then, FINOLA is applied to $\bm{q}$ to iteratively generate the feature map $\bm{z}(x, y)$ through a first-order norm+linear autoregression. Finally, a few upsampling and convolutional layers are used to reconstruct image pixels. Best viewed in color.
Figure 3: Multi-path FINOLA: The input image is encoded into $M$ vectors $\bm{q}_1,\dots,\bm{q}_M$. The shared FINOLA is then applied to each $\bm{q}_i$ to generate feature maps $\bm{\phi}_i(x,y)$, which are aggregated ($\bm{z}=\sum_i\bm{\phi}_i$) and passed through upsampling and convolutional layers to reconstruct image pixels.
Figure 4: Parallel implementation of FINOLA: Horizontal and vertical regressions are separated. The top approach performs horizontal regression first, enabling parallel vertical regression. Similarly, the bottom approach starts with vertical regression, enabling parallel horizontal regression. The results of these approaches are averaged, corresponding to the two autoregression paths from the initial position marked by $\bm{q}$. Best viewed in color.
Figure 5: Reconstruction PSNR for multi-path FINOLA. The generated feature map has a resolution of 64$\times$64, and the image size is 256$\times$256. Increasing the number of paths $M$, as defined in Eq. \ref{eq:group-finola}, consistently enhances reconstruction PSNR across various dimensions ($C=128$ to $C=2048$). The blue lines in the right table represent contour lines of the latent size (equal to $MC$). PSNR remains consistent along each latent size line. Best viewed in color.
...and 16 more figures
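
To make the equation in the caption of Figure 1 concrete, the following short derivation shows what a solution looks like for a single channel. It assumes that $\bm{\Lambda}$ is diagonal with entries $\lambda_1, \ldots, \lambda_C$, which the caption suggests but does not state explicitly; the profile functions $f_k$ are hypothetical and introduced only for illustration.

```latex
% Assumption: \Lambda = diag(\lambda_1, ..., \lambda_C), so the channels decouple.
\begin{align}
  \frac{\partial \zeta_k}{\partial x} &= \lambda_k \frac{\partial \zeta_k}{\partial y},
  \qquad k = 1, \ldots, C. \\
\intertext{Any differentiable profile $f_k$ gives a solution of the form}
  \zeta_k(x, y) &= f_k(y + \lambda_k x), \\
\intertext{since, by the chain rule,}
  \frac{\partial \zeta_k}{\partial x} &= \lambda_k f_k'(y + \lambda_k x)
  = \lambda_k \frac{\partial \zeta_k}{\partial y}.
\end{align}
```

Under this assumption, each channel is a fixed profile transported along the lines $y + \lambda_k x = \text{const}$, which is why the equations are also called transportation equations.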
