
Decomposing Private Image Generation via Coarse-to-Fine Wavelet Modeling

Jasmine Bayrooti, Weiwei Kong, Natalia Ponomareva, Carlos Esteves, Ameesh Makadia, Amanda Prorok

TL;DR

We propose a spectral DP framework built on the hypothesis that the most privacy-sensitive parts of an image are often its low-frequency components in wavelet space, while its high-frequency components are largely generic and public. The framework achieves promising trade-offs between privacy and utility.

Abstract

Generative models trained on sensitive image datasets risk memorizing and reproducing individual training examples, making strong privacy guarantees essential. While differential privacy (DP) provides a principled framework for such guarantees, standard DP finetuning (e.g., with DP-SGD) often results in severe degradation of image quality, particularly in high-frequency textures, due to the indiscriminate addition of noise across all model parameters. In this work, we propose a spectral DP framework based on the hypothesis that the most privacy-sensitive portions of an image are often low-frequency components in the wavelet space (e.g., facial features and object shapes) while high-frequency components are largely generic and public. Based on this hypothesis, we propose the following two-stage framework for DP image generation with coarse image intermediaries: (1) DP finetune an autoregressive spectral image tokenizer model on the low-resolution wavelet coefficients of the sensitive images, and (2) perform high-resolution upsampling using a publicly pretrained super-resolution model. By restricting the privacy budget to the global structures of the image in the first stage, and leveraging the post-processing property of DP for detail refinement, we achieve promising trade-offs between privacy and utility. Experiments on the MS-COCO and MM-CelebA-HQ datasets show that our method generates images with improved quality and style capture relative to other leading DP image frameworks.
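
To make the "indiscriminate addition of noise across all model parameters" concrete, here is a minimal NumPy sketch of one standard DP-SGD update (clip each per-example gradient, sum, add Gaussian noise, average). It is an illustration of the generic algorithm, not the authors' training code: the function name `dp_sgd_step` and all of its parameters are hypothetical, and privacy accounting (tracking $\epsilon$ and $\delta$ across steps) is omitted.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update on a flat parameter vector (illustrative sketch only).

    per_example_grads has shape (batch_size, num_params): one gradient row per example.
    """
    batch_size = per_example_grads.shape[0]
    # L2 norm of each example's gradient.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Shrink any gradient whose norm exceeds clip_norm; smaller ones are unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # The same Gaussian noise scale is applied to every parameter coordinate.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return params - lr * noisy_mean

# Usage with made-up numbers: 8 examples, 5 parameters.
rng = np.random.default_rng(0)
params = np.zeros(5)
grads = rng.normal(size=(8, 5))
params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Because the noise standard deviation is the same for every coordinate, parameters that encode fine texture receive as much noise as those that encode global structure, which is the degradation the abstract describes.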

Paper Structure

This paper contains 19 sections, 2 theorems, 9 equations, 7 figures, 5 tables, and 2 algorithms.

Key Result

Proposition 1

Let $\mathcal{M}: \mathcal{D} \to \mathcal{R}$ be an $(\epsilon, \delta)$-DP algorithm, and let $f: \mathcal{R} \to \mathcal{R}'$ be an arbitrary (possibly randomized) mapping. Then the composition $f \circ \mathcal{M}: \mathcal{D} \to \mathcal{R}'$ is $(\epsilon, \delta)$-DP.
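
For intuition, the standard argument behind this proposition can be sketched as follows, assuming the usual definition of $(\epsilon, \delta)$-DP over neighboring datasets $D, D'$ (the symbols $S$, $T$, $D$, and $D'$ are introduced here for illustration). Take $f$ deterministic, fix any measurable $S \subseteq \mathcal{R}'$, and let $T = \{r \in \mathcal{R} : f(r) \in S\}$. Then

$$
\Pr[f(\mathcal{M}(D)) \in S] = \Pr[\mathcal{M}(D) \in T] \le e^{\epsilon} \Pr[\mathcal{M}(D') \in T] + \delta = e^{\epsilon} \Pr[f(\mathcal{M}(D')) \in S] + \delta.
$$

A randomized $f$ whose randomness is independent of $\mathcal{M}$ is a mixture of deterministic maps, and averaging the inequality over that mixture preserves it. In the two-stage framework described in the abstract, $\mathcal{M}$ plays the role of the DP-finetuned first stage and $f$ the publicly pretrained super-resolution model, which never sees the sensitive data.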

Figures (7)

  • Figure 1: Illustration of the DP-Wavelet method.
  • Figure 2: Visualization of the 2D Discrete Wavelet Transform. Left: original image. Center: a one-level decomposition in which the image is split into the approximation $LL_0$ and details $\{LH_0, HL_0, HH_0\}$. Right: a two-level decomposition in which the image is first split into scale-1 details (blue) and a scale-1 approximation, which is then recursively split into scale-0 details (green) and the final approximation $LL_0$ (red). Note that indices are relative to the coarsest scale $LL_0$.
  • Figure 3: Left: pixel-based patchification. Right: wavelet-based patchification, where separate codebooks are used for the red, green, and blue patches [esteves2025spectral].
  • Figure 4: An example of AR-SIT's partial generation of images using 25%, 50%, 75%, and 100% (left to right) of the full AR transformer token sequence [esteves2025spectral].
  • Figure 5: Sample images generated by the benchmark algorithms for MS-COCO and MM-CelebA-HQ.
  • ...and 2 more figures
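
The decomposition described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration assuming an orthonormal Haar wavelet and even image dimensions; the source does not state which wavelet family the paper uses, and the function names `haar_dwt2` and `coarse_approximation` are hypothetical.

```python
import numpy as np

def haar_dwt2(image):
    """One level of a 2D Haar DWT on an array with even height and width.

    Returns the approximation LL and the detail bands (LH, HL, HH).
    Naming of LH versus HL varies between references.
    """
    x = np.asarray(image, dtype=float)
    # Along each row: pairwise sum (low-pass) and difference (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Along each column: apply the same two filters to both halves.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, (lh, hl, hh)

def coarse_approximation(image, levels):
    """Recursively split the approximation, keeping only the coarsest LL band."""
    ll = np.asarray(image, dtype=float)
    for _ in range(levels):
        ll, _details = haar_dwt2(ll)
    return ll

# Usage: a two-level decomposition of a made-up 8x8 image leaves a 2x2 approximation.
image = np.arange(64, dtype=float).reshape(8, 8)
ll0 = coarse_approximation(image, levels=2)
```

Each level halves both spatial dimensions, so the coarsest approximation is a small, low-resolution summary of global structure, while the discarded detail bands hold the high-frequency content.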

Theorems & Definitions (4)

  • Definition 1
  • Proposition 1: Post-Processing [dwork2014algorithmic]
  • Theorem 1
  • proof