Pattern Analogies: Learning to Perform Programmatic Image Edits by Analogy

Aditya Ganeshan, Thibault Groueix, Paul Guerrero, Radomír Měch, Matthew Fisher, Daniel Ritchie

TL;DR

This work introduces an analogy-driven framework that performs programmatic edits of pattern images without explicit program inference. It combines SplitWeave, a DSL used both to generate diverse, structured patterns and to author analogy inputs at test time, with TriFuser, a Latent Diffusion Model adapted to perform analogical edits by conditioning on the patterns (A, A', B). The model is trained on a large synthetic corpus of analogical quartets and shows strong fidelity and generalization to real-world and novel pattern styles, outperforming several baselines on perceptual and structural metrics. The results enable practical applications such as pattern mixing and animation transfer, giving designers intuitive, structure-preserving edits across diverse pattern domains.

Abstract

Pattern images are everywhere in the digital and physical worlds, and tools to edit them are valuable. But editing pattern images is tricky: desired edits are often programmatic, that is, structure-aware edits that alter the underlying program which generates the pattern. One could attempt to infer this underlying program, but current methods for doing so struggle with complex images and produce unorganized programs that make editing tedious. In this work, we introduce a novel approach to perform programmatic edits on pattern images. By using a pattern analogy -- a pair of simple patterns to demonstrate the intended edit -- and a learning-based generative model to execute these edits, our method allows users to intuitively edit patterns. To enable this paradigm, we introduce SplitWeave, a domain-specific language that, combined with a framework for sampling synthetic pattern analogies, enables the creation of a large, high-quality synthetic training dataset. We also present TriFuser, a Latent Diffusion Model (LDM) designed to overcome critical issues that arise when naively deploying LDMs to this task. Extensive experiments on real-world, artist-sourced patterns reveal that our method faithfully performs the demonstrated edit while also generalizing to related pattern styles beyond its training distribution.
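
As a rough illustration of what a synthetic pattern analogy looks like, the sketch below applies one parametric edit to two different toy pattern programs, yielding a quartet (A, A', B, B'). It is not SplitWeave: `StripePattern`, `render`, `make_quartet`, and `double_period` are hypothetical names invented for this example, and the "patterns" are small character grids rather than images.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class StripePattern:
    """Toy stand-in for a pattern program (hypothetical, not SplitWeave)."""
    period: int     # width of one stripe repeat, in cells
    fill: int       # number of filled cells per repeat
    diagonal: bool  # shift the stripes by one cell per row


def render(p: StripePattern, size: int = 8) -> List[str]:
    """Rasterize the toy program into rows of '#' (filled) and '.' (empty)."""
    rows = []
    for y in range(size):
        shift = y if p.diagonal else 0
        rows.append("".join(
            "#" if (x + shift) % p.period < p.fill else "."
            for x in range(size)
        ))
    return rows


Edit = Callable[[StripePattern], StripePattern]


def make_quartet(a: StripePattern, b: StripePattern, edit: Edit
                 ) -> Tuple[StripePattern, StripePattern, StripePattern, StripePattern]:
    """Apply the same parametric edit to both programs: (A, A', B, B')."""
    return a, edit(a), b, edit(b)


def double_period(p: StripePattern) -> StripePattern:
    """Example edit: widen the repeat while leaving other parameters alone."""
    return replace(p, period=p.period * 2)


a = StripePattern(period=2, fill=1, diagonal=False)  # simple pattern A
b = StripePattern(period=4, fill=3, diagonal=True)   # more complex pattern B
quartet = make_quartet(a, b, double_period)

if __name__ == "__main__":
    for name, prog in zip(["A", "A'", "B", "B'"], quartet):
        print(name)
        print("\n".join(render(prog)))
```

In this toy setting the edit is applied directly to B's parameters. The point of the approach described in the abstract is that a trained model would produce B' from the three images alone, without access to B's program.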

Paper Structure

This paper contains 14 sections, 4 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Our system performs programmatic edits on pattern images without inferring their underlying programs. (Left) Desired edits, expressed with a pair of patterns $(A, A^\prime)$, are executed on a target pattern $B$ by a generative model to produce $B^\prime$. (Right) Parametric changes $A \rightarrow A^\prime$ enabled by our domain-specific pattern language induce corresponding changes to the more complex pattern $B$.
  • Figure 2: Overview: To create high-quality visual patterns, we introduce a custom DSL called SplitWeave. Pairs of SplitWeave programs $(A, B)$ are then jointly edited to create analogical quartets. This synthetic data is then used to train TriFuser, a neural network for analogical pattern editing.
  • Figure 3: Custom program samplers for two pattern styles. Our samplers produce diverse and high-quality patterns, enabling generalization to real-world patterns.
  • Figure 4: We create synthetic analogical quartets $(A, A^\prime, B, B^\prime)$ in which the same edit is applied to both $A$ and $B$, providing data for training analogical editing models.
  • Figure 5: (Left) TriFuser is a latent diffusion model conditioned on patch-wise tokens of the input images $(A, A^\prime, B)$ to generate the analogically edited pattern $B^\prime$. (Right) To achieve high-quality edits, we enrich these tokens by fusing multi-level features from multiple encoders, followed by a 3D positional encoding: 2D to specify spatial locations and 1D to specify the token's source ($A$, $A^\prime$ or $B$).
  • ...and 3 more figures
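
The 3D positional encoding mentioned in the Figure 5 caption (two spatial axes plus one axis for the token's source image) can be pictured with the minimal sketch below. The caption does not say how the encoding is computed, so this example assumes a standard sinusoidal code per axis, concatenated across the three axes; `sinusoid`, `positional_code`, `token_positions`, and `dim_per_axis` are hypothetical names, and the actual TriFuser design may differ.

```python
import math
from typing import Dict, List, Tuple


def sinusoid(position: int, dim: int) -> List[float]:
    """Sinusoidal code for one integer coordinate (dim must be even)."""
    out = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        out.append(math.sin(position * freq))
        out.append(math.cos(position * freq))
    return out


def positional_code(row: int, col: int, source: int,
                    dim_per_axis: int = 4) -> List[float]:
    """Concatenate codes for the 2D patch location and the 1D source index.

    Assumption: source 0 = A, 1 = A', 2 = B.
    """
    return (sinusoid(row, dim_per_axis)
            + sinusoid(col, dim_per_axis)
            + sinusoid(source, dim_per_axis))


def token_positions(grid: int, dim_per_axis: int = 4
                    ) -> Dict[Tuple[int, int, int], List[float]]:
    """One code per patch token for three grid x grid input images."""
    codes = {}
    for source in range(3):
        for row in range(grid):
            for col in range(grid):
                codes[(source, row, col)] = positional_code(
                    row, col, source, dim_per_axis)
    return codes


if __name__ == "__main__":
    codes = token_positions(grid=2)
    print(len(codes), "tokens,", len(codes[(0, 0, 0)]), "dims each")
```

With this layout, tokens at the same patch location in A and B share the spatial part of the code and differ only in the source part, which is one simple way a model could tell corresponding patches apart by image.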