LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping

Pascal Chang, Sergio Sancho, Jingwei Tang, Markus Gross, Vinicius C. Azevedo

TL;DR

LookingGlass introduces a latent-space, feed-forward approach to generating high-quality anamorphoses by marrying latent rectified flows with a Laplacian Pyramid Warping (LPW) framework. By operating in latent spaces and encoding transformations in image space, the method preserves details while supporting arbitrary view projections beyond simple 2D transformations. The LPW pipeline blends multi-view information across pyramid levels to reduce artifacts from extreme distortions, while VAE encoding/decoding and a residual correction minimize reconstruction errors. Quantitative and qualitative results show improved fidelity, CLIP alignment, and user-preferred outcomes over prior diffusion-based and pixel-space approaches, with a practical runtime on contemporary GPUs. The work enables robust, interpretable, and extensible generative anamorphoses with potential applications in generative texture mapping, panorama synthesis, and advanced perceptual illusions.

Abstract

Anamorphosis refers to a category of images that are intentionally distorted, making them unrecognizable when viewed directly. Their true form only reveals itself when seen from a specific viewpoint, which can be through some catadioptric device like a mirror or a lens. While the construction of these mathematical devices can be traced back to as early as the 17th century, they are only interpretable when viewed from a specific vantage point and tend to lose meaning when seen normally. In this paper, we revisit these famous optical illusions with a generative twist. With the help of latent rectified flow models, we propose a method to create anamorphic images that still retain a valid interpretation when viewed directly. To this end, we introduce Laplacian Pyramid Warping, a frequency-aware image warping technique key to generating high-quality visuals. Our work extends Visual Anagrams (arXiv:2311.17919) to latent space models and to a wider range of spatial transforms, enabling the creation of novel generative perceptual illusions.
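For readers unfamiliar with the Laplacian pyramid that Laplacian Pyramid Warping builds on, the following is a minimal NumPy sketch of the standard decomposition and its exact reconstruction. It is not the authors' implementation: the box-filter `downsample`, the nearest-neighbour `upsample`, and all function names are illustrative assumptions, and the sketch handles only single-channel images whose sides are divisible by 2^(levels-1).

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging each 2x2 block (assumed box filter)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double the resolution by nearest-neighbour repetition (assumed)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Split an image into band-pass detail levels plus a coarse residual."""
    pyramid = []
    current = img
    for _ in range(levels - 1):
        low = downsample(current)
        # Detail lost by going one level coarser: the band for this level.
        pyramid.append(current - upsample(low))
        current = low
    pyramid.append(current)  # coarsest low-pass residual
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding the bands back."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = upsample(current) + band
    return current

# Usage: decompose a random 16x16 image into 3 levels and rebuild it.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
bands = build_laplacian_pyramid(image, levels=3)
rebuilt = reconstruct(bands)
```

Because each band stores exactly what the next coarser level discards, summing the levels back recovers the input. A frequency-aware warp can therefore treat fine and coarse content separately before recombining them.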

Paper Structure

This paper contains 58 sections, 16 equations, 32 figures, 4 tables, 1 algorithm.

Figures (32)

  • Figure 1: We propose a method to generate ambiguous anamorphoses—images that reveal a hidden image when viewed through a mirror or lens. In the examples above, a conic mirror viewed from the top reveals a turtle hidden in an Earth image; a garden, seen through a lens, shows a bunny, and rotating the lens slightly reveals a gnome; a cylindrical mirror reflects a village painting into the face of an old man.
  • Figure 2: Laplacian Pyramid Warping. (a) The view mappings are generated using a ray tracer, and the Level of Detail (LOD) map is computed from them. (b) For each pixel, our forward warping algorithm uses the UV warp mapping and the LOD to fetch the corresponding value. (c) We consider different views, from 2D transformations such as vertical flips and arbitrary-angle rotations to complex 3D projections.
  • Figure 3: A real-life demo of the cylindrical mirror illusion.
  • Figure 4: Our Proposed Pipeline. At each denoising step, the estimated final image is computed from the network velocity estimate and decoded into image space. Image warping and view aggregation are performed in image space using Laplacian pyramids, before encoding back into latent space for the diffusion step.
  • Figure 5: Latent Visual Anagrams. In this 135° rotation example, we demonstrate that our contributions (subsection labeled "glva") improve the generation of visual anagrams with latent models. While the final estimate from SyncTweedies (Kim et al., 2024) partially addresses noise issues, artifacts from the VAE persist. Our VAE encoding/decoding process and residual correction further enhance image quality.
  • ...and 27 more figures
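
The Figure 4 caption mentions computing the estimated final image from the network velocity estimate. A minimal sketch of that step is given below, assuming the common rectified-flow convention x_t = (1 - t) * x0 + t * noise with velocity = noise - x0. The paper's exact parameterization may differ, and the function name `estimate_clean_sample` is hypothetical.

```python
import numpy as np

def estimate_clean_sample(x_t, velocity, t):
    """One-step estimate of the clean sample x0 from the state x_t at time t.

    Assumes x_t = (1 - t) * x0 + t * noise and velocity = noise - x0,
    which rearranges to x0 = x_t - t * velocity.
    """
    return x_t - t * velocity

# Usage with a synthetic latent and the exact velocity standing in for
# the network prediction (a real model would only approximate it).
rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 8, 8))      # assumed latent shape, illustrative only
noise = rng.normal(size=(4, 8, 8))
t = 0.7
x_t = (1 - t) * x0 + t * noise       # interpolated state at time t
v_exact = noise - x0                 # ground-truth velocity under this convention
x0_hat = estimate_clean_sample(x_t, v_exact, t)
```

With an exact velocity the estimate recovers the clean latent. In the pipeline described by the caption, this estimate is what gets decoded, warped, aggregated across views, and encoded back into latent space.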