Generative Anchored Fields: Controlled Data Generation via Emergent Velocity Fields and Transport Algebra

Deressa Wodajo Deressa, Hannes Mareen, Peter Lambert, Glenn Van Wallendael

TL;DR

Generative Anchored Fields (GAF) reframes data generation by learning independent endpoint predictors anchored to the noise and data endpoints, rather than a single trajectory model. The emergent velocity field v = K - J arises from the time-conditioned disagreement between these predictors, enabling Transport Algebra for compositional generation across classes and modalities. Through a modular trunk and per-class K heads, GAF achieves competitive sample quality and introduces lossless cyclic transport (LPIPS = 0) and precise semantic manipulation as intrinsic architectural primitives. This framework opens avenues for intrinsic, deterministic, and scalable compositional generation, with potential extensions to sequential domains such as video through Motion Algebra.

Abstract

We present Generative Anchored Fields (GAF), a generative model that learns independent endpoint predictors $J$ (noise) and $K$ (data) rather than a trajectory predictor. The velocity field $v=K-J$ emerges from their time-conditioned disagreement. This factorization enables Transport Algebra: algebraic operations on the learned $\{(J_n,K_n)\}_{n=1}^N$ heads for compositional control. With class-specific $K_n$ heads, GAF supports a rich family of directed transport maps between a shared base distribution and multiple modalities, enabling controllable interpolation, hybrid generation, and semantic morphing through vector arithmetic. We achieve strong sample quality (FID 7.5 on CelebA-HQ $64\times 64$) while uniquely providing compositional generation as an architectural primitive. We further demonstrate that GAF achieves lossless cyclic transport between its initial and final states, with LPIPS=$0.0$. Code is available at https://github.com/IDLabMedia/GAF
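
The idea of a velocity field that emerges from two endpoint predictors can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration: the heads `j_head` and `k_head` are hypothetical closed-form stand-ins for trained networks, the data distribution is a single point, the path between endpoints is taken to be linear, and forward Euler is used as the integrator. None of this is taken from the GAF code base.

```python
# Toy sketch: closed-form stand-ins for the learned J and K heads (hypothetical, not GAF's networks).

def k_head(x_t, t, target):
    """Data-endpoint predictor K: for a point-mass data distribution it is the target itself."""
    return list(target)

def j_head(x_t, t, target):
    """Noise-endpoint predictor J: inverts the assumed linear path x_t = (1 - t) * x0 + t * x1."""
    return [(x - t * m) / (1.0 - t) for x, m in zip(x_t, target)]

def velocity(x_t, t, target):
    """Emergent velocity v = K - J, the disagreement between the two endpoint predictions."""
    k = k_head(x_t, t, target)
    j = j_head(x_t, t, target)
    return [ki - ji for ki, ji in zip(k, j)]

def transport(x0, target, steps=10):
    """Forward Euler integration of dx/dt = v(x, t) from t = 0 (noise) to t = 1 (data)."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt  # t stays strictly below 1, so j_head never divides by zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

noise = [0.5, -1.25]
data_point = [2.0, 3.0]
sample = transport(noise, data_point)
```

In this toy setting the velocity is constant along the path, so the integration carries the noise sample onto the data point up to floating-point error. A trained model would replace both heads with networks conditioned on $(x_t, t)$, and the velocity would then vary along the trajectory.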

Paper Structure

This paper contains 36 sections, 24 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: Generated samples demonstrating our transport algebra framework. (a) Selected samples from CelebA-HQ ($256\times 256$ px), demonstrating high-fidelity generation across diverse demographics and attributes. (b) Lossless cyclic transport across three domains: $v_{\text{cat}_0}\to v_{\text{dog}}\to v_{\text{wild}}\to v_{\text{cat}_0}$ with LPIPS=$0.0$. (c) Barycentric interpolation: $v_{i\to j \to k} = \alpha v_i + \beta v_j + \gamma v_k$, showing smooth multi-domain transitions in velocity space.
  • Figure 2: High-level overview of the GAF architecture and its data flow.
  • Figure 3: Visualization of the swap operator $\mathcal{S}$ acting on a bridge. (A) The initial configuration of the bridge (Eq. \ref{eq:bridge}). Subfigures (B), (C), and (D) illustrate the results of applying the operations described in Eqs. \ref{eq:pt1}, \ref{eq:pt2}, and \ref{eq:pt3}, respectively.
  • Figure 4: Three $J/K$ pairing topologies: (A) one-to-one, (B) star (one-to-many), and (C) clustered. Each topology shows different ways to map the noise endpoint $J$ to the data endpoint $K$.
  • Figure 5: GAF architecture. The DiT-Trunk block follows the DiT architecture of Peebles and Xie (2023).
  • ...and 8 more figures
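
The barycentric interpolation in the Figure 1(c) caption, $\alpha v_i + \beta v_j + \gamma v_k$, can be sketched as a weighted sum of per-class velocities. The snippet below is a hedged illustration only: the function name `blend_velocities`, the example numbers, the requirement that the weights sum to 1, and the assumption that all classes share one $J$ prediction are choices made here, not details confirmed by the paper.

```python
def blend_velocities(velocities, weights):
    """Weighted sum of per-class velocity vectors; weights are assumed to sum to 1."""
    if len(velocities) != len(weights):
        raise ValueError("need exactly one weight per velocity")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    dim = len(velocities[0])
    return [sum(w * v[d] for w, v in zip(weights, velocities)) for d in range(dim)]

# Hypothetical endpoint predictions at one (x_t, t): a shared J and three class-specific K heads.
j_pred = [0.2, -0.4]
k_preds = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
weights = [0.5, 0.3, 0.2]  # alpha, beta, gamma

# Per-class velocities v_n = K_n - J, then their barycentric blend.
velocities = [[k - j for k, j in zip(k_n, j_pred)] for k_n in k_preds]
blended = blend_velocities(velocities, weights)

# Under the shared-J assumption, blending velocities equals blending the K predictions first.
k_mix = [sum(w * k_n[d] for w, k_n in zip(weights, k_preds)) for d in range(2)]
blended_from_k = [k - j for k, j in zip(k_mix, j_pred)]
```

Because the weights sum to 1, the shared $J$ term passes through the blend unchanged, so mixing velocities is equivalent to mixing the data-endpoint predictions and then subtracting $J$. If each class had its own $J_n$, the two computations would generally differ.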