Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

Giannis Daras, Joseph Dean, Ajil Jalal, Alexandros G. Dimakis

TL;DR

This work introduces Intermediate Layer Optimization (ILO), a method to solve inverse problems with pre-trained deep generators by progressively optimizing through intermediate layers. By expanding the search to an extended range around the previous layer via an $l_1$-ball constraint, ILO achieves improved recovery guarantees and empirical performance over prior CSGM-based approaches across inpainting, denoising, super-resolution, and compressed sensing, including structured measurements with partial circulant matrices. The authors provide a rigorous theoretical framework, including a main error bound and S-REC-based analysis, and demonstrate practical adaptations to StyleGAN-2, along with classifier-guided generation in a controlled setting. They also release code and discuss ethical considerations, highlighting both the potential for advancement and risks associated with broader generative capabilities.

Abstract

We propose Intermediate Layer Optimization (ILO), a novel optimization algorithm for solving inverse problems with deep generative models. Instead of optimizing only over the initial latent code, we progressively change the input layer obtaining successively more expressive generators. To explore the higher dimensional spaces, our method searches for latent codes that lie within a small $l_1$ ball around the manifold induced by the previous layer. Our theoretical analysis shows that by keeping the radius of the ball relatively small, we can improve the established error bound for compressed sensing with deep generative models. We empirically show that our approach outperforms state-of-the-art methods introduced in StyleGAN-2 and PULSE for a wide range of inverse problems including inpainting, denoising, super-resolution and compressed sensing.
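The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. It is a toy under stated assumptions, not the authors' implementation: the generator is a hypothetical two-layer map $G_2 \circ G_1$ with $G_1(z)=\tanh(W_1 z)$ and linear $G_2(h)=W_2 h$, the names `ilo_toy` and `project_l1_ball` are illustrative, and the $l_1$ ball is centred on the stage-1 solution rather than on the whole range of $G_1$. Stage 1 optimizes only the initial latent code; stage 2 moves the optimization to the intermediate layer and keeps the offset inside an $l_1$ ball by projected gradient descent.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius} (sort-based method)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]               # magnitudes, descending
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * idx > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ilo_toy(seed=0, k=4, p=16, n=32, m=12, radius=0.5):
    rng = np.random.default_rng(seed)
    # Hypothetical toy generator: G1(z) = tanh(W1 z), G2(h) = W2 h.
    W1 = rng.normal(size=(p, k)) / np.sqrt(k)
    W2 = rng.normal(size=(n, p)) / np.sqrt(p)
    A = rng.normal(scale=np.sqrt(1.0 / m), size=(m, n))  # Gaussian measurements
    G1 = lambda z: np.tanh(W1 @ z)
    M = A @ W2                                           # measurement of G2's output

    # Target lies slightly outside the range of G = G2 o G1 (assumption).
    h_true = G1(rng.normal(size=k)) + 0.1 * rng.normal(size=p)
    y = M @ h_true
    loss = lambda h: float(np.sum((M @ h - y) ** 2))

    # Stage 1: gradient descent over the initial latent code z only.
    z = 0.1 * rng.normal(size=k)
    best_z, best = z.copy(), loss(G1(z))
    for _ in range(500):
        h = G1(z)
        grad_h = 2.0 * M.T @ (M @ h - y)
        grad_z = W1.T @ ((1.0 - h ** 2) * grad_h)        # chain rule through tanh
        z = z - 0.05 * grad_z
        cur = loss(G1(z))
        if cur < best:
            best_z, best = z.copy(), cur
    h0 = G1(best_z)
    loss_stage1 = loss(h0)

    # Stage 2: optimize the intermediate input h0 + delta with ||delta||_1 <= radius.
    step = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2)       # 1 / Lipschitz const. of gradient
    delta = np.zeros(p)
    for _ in range(500):
        grad = 2.0 * M.T @ (M @ (h0 + delta) - y)
        delta = project_l1_ball(delta - step * grad, radius)

    return {
        "loss_stage1": loss_stage1,
        "loss_stage2": loss(h0 + delta),
        "l1_norm_delta": float(np.abs(delta).sum()),
        "radius": radius,
    }

if __name__ == "__main__":
    print(ilo_toy())
```

Because stage 2 starts from the stage-1 solution (zero offset) and uses a step size of one over the gradient's Lipschitz constant, its measurement loss cannot exceed the stage-1 loss in this toy setting. The radius controls how far the search may leave the previous layer's output.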

Paper Structure

This paper contains 28 sections, 11 theorems, 53 equations, 7 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Let $G=G_2\circ G_1$ with $G_1:\mathbb R^k \to \mathbb R^p$ be an $L_1$-Lipschitz function and $G_2:\mathbb R^p \to \mathbb R^n$ be an $L_2$-Lipschitz function. Let $A \in \mathbb R^{m\times n}$ be the measurement matrix with i.i.d. entries $A_{ij} \sim \mathcal{N}(0, 1/m)$. Let $K$ be a parameter of our c and the measurements optimum in the extended range Then, if the number of measurements is sufficie
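The measurement model in the theorem statement can be sketched as follows. This is only an illustration of the sampling assumption $A_{ij} \sim \mathcal{N}(0, 1/m)$; the function names and the sizes `m`, `n` are hypothetical. With this scaling, $\|Ax\|^2$ equals $\|x\|^2$ in expectation, which the snippet checks empirically for one random vector.

```python
import numpy as np

def gaussian_measurement_matrix(m, n, rng):
    """Sample A in R^{m x n} with i.i.d. N(0, 1/m) entries."""
    return rng.normal(loc=0.0, scale=np.sqrt(1.0 / m), size=(m, n))

def norm_ratio(m=400, n=1000, seed=0):
    """Return ||A x||^2 / ||x||^2 for one random x; concentrates around 1."""
    rng = np.random.default_rng(seed)
    A = gaussian_measurement_matrix(m, n, rng)
    x = rng.normal(size=n)
    return float(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)

if __name__ == "__main__":
    print(norm_ratio())
```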

Figures (7)

  • Figure 1: Results on the inpainting task. Rows 1, 2, 3 and 5 are real images (outside of the test set, collected from the web) while rows 4 and 6 are StyleGAN-2 generated images. Column 2: the first five images have masks that were chosen to remove important facial features. The last row is an example of randomized inpainting, i.e. only a random $1\%$ of the total pixels is observed. Columns 3-5: reconstructions using the CSGM algorithm [bora2017compressed] with the StyleGAN-2 generator and the optimization setting described in PULSE [pulse]. While PULSE only applies to super-resolution, we extend it using MSE, LPIPS and joint MSE+LPIPS losses. The experiments of Columns 3-5 form an ablation study of the benefits of each loss function. Column 6: reconstructions with ILO (ours). As shown, ILO consistently gives better reconstructions of the original image. Also, many biased reconstructions can be corrected by our method. In the last two rows, recovery of the image is still possible from very few pixel observations using our method.
  • Figure 2: Plots showing the true MSE on CelebA-HQ images, i.e. the MSE between the real image (that we never observe) and the image reconstructed from the measurements. From left to right: Inpainting, Denoising, Super-resolution and Compressed sensing with partial circulant matrices. As shown, ILO significantly outperforms all previous methods except in the very noisy regime.
  • Figure 3: Results on the super-resolution task. ILO (ours) gives more accurate reconstructions compared to PULSE (third column) and other baselines. Many biased reconstructions can be corrected by applying ILO on the weighted combination of MSE and LPIPS.
  • Figure 4: Illustration of using a classifier as a differentiable forward operator. Here we assume that the only observation is $y=\mathcal{A}(x|\textrm{class})$ where $\mathcal{A}$ is an ImageNet classifier. The classes used in this Figure are (from top-left): Frog, Coral, Irish Wolf Dog, Goldfish, Boston Terrier Dog and Apple. We use a robust classifier as proposed by [santurkar2019image] and solve the inverse problem to generate images that look like these classes. The difference from [santurkar2019image] is that we perform the search using ILO in the latent spaces of the StyleGAN-2 generator rather than in pixel space, which keeps the images closer to human faces.
  • Figure 5: Morphing using a classifier for the Bull Frog class, while also keeping a loss term for the distance to a well-known machine learning researcher, with his permission.
  • ...and 2 more figures

Theorems & Definitions (26)

  • Definition 1: S-REC [bora2017compressed]
  • Theorem 1
  • Remark 1: Choice of $K$
  • Remark 2: CSGM sample bound applied directly on the intermediate layer
  • Remark 3: Parameter Scaling
  • Lemma 1
  • Definition 2: Covering number [wainwright2019high]
  • Definition 3: Packing number [wainwright2019high]
  • Lemma 2: [wainwright2019high]
  • Theorem 2: Maurey's Empirical Method [maurey]
  • ...and 16 more
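
For reference, the Set-Restricted Eigenvalue Condition (S-REC) named in Definition 1 is usually stated as below in the CSGM literature. The symbols $S$, $\gamma$ and $\delta$ follow that convention and are assumptions here, since the exact form used in this paper is not reproduced in the listing above. A matrix $A \in \mathbb R^{m\times n}$ satisfies S-REC$(S, \gamma, \delta)$ for a set $S \subseteq \mathbb R^n$ and parameters $\gamma > 0$, $\delta \geq 0$ if

$$\|A(x_1 - x_2)\| \;\geq\; \gamma \,\|x_1 - x_2\| - \delta \qquad \text{for all } x_1, x_2 \in S.$$

Intuitively, two points of $S$ that are far apart cannot be mapped to nearly identical measurements, up to the additive slack $\delta$.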