Convolutional Optical Encoders for Generalizable Image Compression

Yubo Zhang, Rui Chen, Zhihao Zhou, Arka Majumdar

TL;DR

This work systematically studies several PSF encoding strategies combined with a total-variation (TV) digital reconstruction backend and shows that, at the same compression ratio, spatial binning achieves the highest reconstruction quality among the strategies compared; however, it is less robust to noise than the multi-channel methods.

Abstract

We investigate the utility of meta-optical encoders for generalizable image compression by leveraging their intrinsic shift-invariant point spread functions (PSFs). Compared with purely digital approaches, such optical encoders offer parallel and energy-efficient compression, enabling early data reduction prior to electronic processing and transmission, which is particularly attractive for resource-constrained and compact imaging systems. Although the operations realizable by a single passive optical layer remain fundamentally constrained, we systematically study several PSF encoding strategies combined with a total-variation (TV) digital reconstruction backend. Specifically, under identical compression ratios, we compare spatial binning, multi-channel random, and multi-channel orthogonal PSF-based designs. Our results show that, at the same compression ratio, spatial binning achieves the highest reconstruction quality among all encoding strategies; however, it is less robust to noise than the multi-channel methods.
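The encode-then-reconstruct pipeline described in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the random 5x5 PSF, circular (FFT) boundary handling, the smoothed isotropic TV penalty, plain gradient descent as the solver, and all function names and parameter values. It models a single-channel encoder only.

```python
import numpy as np

def encode(image, psf, bin_size):
    """Circular convolution with the PSF, then bin_size x bin_size average binning."""
    otf = np.fft.fft2(psf, s=image.shape)  # zero-pads the PSF to the image size
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * otf))
    h, w = image.shape  # assumed divisible by bin_size
    return blurred.reshape(h // bin_size, bin_size, w // bin_size, bin_size).mean(axis=(1, 3))

def encode_adjoint(measurement, psf, bin_size, shape):
    """Adjoint of encode: spread each bin back over its block, then correlate with the PSF."""
    up = np.kron(measurement, np.ones((bin_size, bin_size))) / bin_size**2
    otf = np.fft.fft2(psf, s=shape)
    return np.real(np.fft.ifft2(np.fft.fft2(up) * np.conj(otf)))

def tv_grad(x, eps=1e-3):
    """Gradient of the smoothed TV penalty sum(sqrt(dx^2 + dy^2 + eps)), periodic boundaries."""
    dx = np.roll(x, -1, axis=1) - x
    dy = np.roll(x, -1, axis=0) - x
    mag = np.sqrt(dx**2 + dy**2 + eps)
    px, py = dx / mag, dy / mag
    return -(px - np.roll(px, 1, axis=1)) - (py - np.roll(py, 1, axis=0))

def objective(x, y, psf, bin_size, lam, eps=1e-3):
    """Least-squares data term plus lam times the smoothed TV penalty."""
    dx = np.roll(x, -1, axis=1) - x
    dy = np.roll(x, -1, axis=0) - x
    tv = np.sum(np.sqrt(dx**2 + dy**2 + eps))
    return 0.5 * np.sum((encode(x, psf, bin_size) - y) ** 2) + lam * tv

def reconstruct(y, psf, bin_size, shape, lam=1e-3, step=1.0, iters=200):
    """Gradient descent on the objective, starting from the upsampled measurement."""
    x = np.kron(y, np.ones((bin_size, bin_size)))
    for _ in range(iters):
        residual = encode(x, psf, bin_size) - y
        x = x - step * (encode_adjoint(residual, psf, bin_size, shape) + lam * tv_grad(x))
    return x

# Hypothetical demo: a 32x32 test image, a random non-negative PSF, 2x2 binning.
rng = np.random.default_rng(0)
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
psf = rng.random((5, 5))
psf /= psf.sum()  # normalize so the encoder conserves total intensity
y = encode(image, psf, bin_size=2)
x_hat = reconstruct(y, psf, bin_size=2, shape=image.shape)
compression_ratio = image.size / y.size  # pixels in / measurements out
```

In this sketch the compression ratio is set entirely by the binning factor. The default step size is chosen for a normalized, non-negative PSF with 2x2 binning and the default `lam` and `eps`; other settings would likely need a different step or a more robust solver.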

Paper Structure (4 sections, 1 equation, 4 figures)

Figures (4)

  • Figure 1: Schematic of our workflow for image compression, where the convolution encoding is performed by meta-optics. We apply average binning as region of interest (ROI) detection to compress the image after convolution. Linear regression with total variation (TV) regularization is used to reconstruct the image.
  • Figure 2: Reconstruction VIF (Visual Information Fidelity) vs. compression ratio on the Arimp4 (top) and CelebA (bottom) datasets with different kernel design strategies.
  • Figure 3: Reconstruction VIF as a function of the TV regularization hyperparameter $\lambda$, swept from $10^{-7}$ to $10^{3}$. Larger values of $\lambda$ impose a stronger prior favoring piecewise-smooth images with smoother backgrounds. All results in this sweep are obtained on the Arimp4 dataset.
  • Figure 4: Comparison of reconstruction quality across varying SNRs among four strategies. The left column shows PSNR and its corresponding dB-scale loss, while the right column shows VIF and its corresponding dB-scale loss.
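The SNR sweep and PSNR metric referenced in the Figure 4 caption can be made concrete with a small sketch. The noise model here (additive white Gaussian noise scaled to a target SNR in dB) and the unit peak value in the PSNR are assumptions for illustration; the captions do not specify how noise is injected or how the dB-scale loss is defined, and the function names are hypothetical.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = np.mean(signal**2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.shape)

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB, assuming intensities in [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return 10 * np.log10(peak**2 / mse)

# Hypothetical demo: corrupt a random array at 20 dB SNR and check the realized SNR.
rng = np.random.default_rng(0)
clean = rng.random((256, 256))
noisy = add_noise_at_snr(clean, snr_db=20.0, rng=rng)
measured_snr_db = 10 * np.log10(np.mean(clean**2) / np.mean((noisy - clean) ** 2))
```

In a robustness study of this kind, the noise would presumably be applied to the compressed measurement before reconstruction, with the metric then computed between the original image and the reconstruction.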