Rethinking Image Compression on the Web with Generative AI

Shayan Ali Hassan, Danish Humair, Ihsan Ayyub Qazi, Zafar Ayyub Qazi

TL;DR

The study investigates a generative-AI-driven approach to web image compression in which images are reconstructed at the edge. Its Pseudo-Lossy Compression framework transmits semantic and structural conditioning inputs (text prompts, Canny edges, color palettes, and in-painting masks) to a text-to-image model. The method achieves substantial bandwidth savings (up to 99.8% in the best cases and 92.6% on average) while preserving perceptual content, as measured by VGG16 embeddings and supported by a user study. The work quantifies the bandwidth-similarity trade-offs across multiple conditioning configurations, demonstrates the importance of preserving salient features, and discusses practical considerations for real-time deployment, ethics, and standards. Overall, the approach offers a promising direction for reducing web-image data transfer without severely compromising meaning or structure, with potential implications for Internet affordability and infrastructure costs.

Abstract

The rapid growth of the Internet, driven by social media, web browsing, and video streaming, has made images central to the Web experience, resulting in significant data transfer and increased webpage sizes. Traditional image compression methods, while reducing bandwidth, often degrade image quality. This paper explores a novel approach using generative AI to reconstruct images at the edge or client-side. We develop a framework that leverages text prompts and provides additional conditioning inputs like Canny edges and color palettes to a text-to-image model, achieving up to 99.8% bandwidth savings in the best cases and 92.6% on average, while maintaining high perceptual similarity. Empirical analysis and a user study show that our method preserves image meaning and structure more effectively than traditional compression methods, offering a promising solution for reducing bandwidth usage and improving Internet affordability with minimal degradation in image quality.
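
The two quantities the abstract reports, bandwidth savings and perceptual similarity, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the `ConditioningPayload` container, the byte sizes, and the choice of cosine similarity as the way to compare embedding vectors are all hypothetical. In a real pipeline the vectors would come from a VGG16 feature extractor; here they are plain lists of floats.

```python
import math
from dataclasses import dataclass


@dataclass
class ConditioningPayload:
    """Hypothetical container for what is sent in place of the image."""
    prompt: str           # text description of the image
    edge_map: bytes       # compressed Canny edge map
    palette: bytes = b""  # optional color palette, 3 bytes per RGB color

    def size_bytes(self) -> int:
        # Total bytes on the wire: UTF-8 prompt + edge map + palette.
        return len(self.prompt.encode("utf-8")) + len(self.edge_map) + len(self.palette)


def bandwidth_savings(original_bytes: int, payload_bytes: int) -> float:
    """Fraction of bytes saved relative to sending the original image."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 1.0 - payload_bytes / original_bytes


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two equal-length embedding vectors.

    Assumption: cosine similarity is used only as an example metric.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Illustrative numbers only; these are not measurements from the paper.
payload = ConditioningPayload(
    prompt="a red bicycle leaning against a brick wall",
    edge_map=bytes(4000),   # stand-in for a compressed edge map
    palette=bytes(5 * 3),   # five RGB colors
)
savings = bandwidth_savings(original_bytes=200_000, payload_bytes=payload.size_bytes())
print(f"payload: {payload.size_bytes()} bytes, savings: {savings:.1%}")
```

The sketch makes the trade-off concrete: each extra conditioning input (a palette, a denser edge map) increases the payload and lowers the savings, in exchange for a reconstruction that is expected to sit closer to the original in embedding space.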

Paper Structure

This paper contains 13 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Design space of using generative AI for image optimization.
  • Figure 2: The Pseudo-Lossy Compression Framework.
  • Figure 3: Most bandwidth savings come from converting images to Canny edge maps; further reductions can be obtained by applying JBIG2 compression.
  • Figure 4: Trade-off between expected bandwidth reduction and similarity (measured by VGG16) for the first three experiments.
  • Figure 5: Results from the user study showing (a) structural similarity ratings, (b) meaning preservation responses, and (c) preference between compressed and reconstructed images.