IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces

Nian Wu, Nivetha Jayakumar, Jiarui Xing, Miaomiao Zhang

TL;DR

IGG introduces a geometry-aware framework for image generation that deforms a template along learned geodesics in deformation spaces, guided by text prompts. It combines a geodesic-learning autoencoder for diffeomorphic velocity fields with a latent diffusion model that samples geodesic deformation sequences conditioned on image context and language. The method yields topology-preserving transformations and provides DetJac-based metrics to quantify geometric changes, outperforming state-of-the-art baselines on Komatsuna plant growth and longitudinal brain MRI data. This work advances topology-aware image synthesis with potential impact on computational anatomy, biology, and robotics by delivering anatomically faithful and editable image generations.

Abstract

Generative models have recently gained increasing attention in image generation and editing tasks. However, they often lack a direct connection to object geometry, which is crucial in sensitive domains such as computational anatomy, biology, and robotics. This paper presents a novel framework for Image Generation informed by Geodesic dynamics (IGG) in deformation spaces. Our IGG model comprises two key components: (i) an efficient autoencoder that explicitly learns the geodesic path of image transformations in the latent space; and (ii) a latent geodesic diffusion model that captures the distribution of latent representations of geodesic deformations conditioned on text instructions. By leveraging geodesic paths, our method ensures smooth, topology-preserving, and interpretable deformations, capturing complex variations in image structures while maintaining geometric consistency. We validate the proposed IGG on plant growth data and brain magnetic resonance imaging (MRI). Experimental results show that IGG outperforms the state-of-the-art image generation/editing models with superior performance in generating realistic, high-quality images with preserved object topology and reduced artifacts. Our code is publicly available at https://github.com/nellie689/IGG.
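The topology-preservation claim can be checked numerically through the determinant of the Jacobian (DetJac) of a deformation field: values above 1 indicate local expansion, values between 0 and 1 indicate compression, and non-positive values indicate folding, i.e. a loss of topology. The following is a minimal NumPy sketch of that check for a 2D field, not the authors' implementation; the function names `identity_grid` and `jacobian_determinant_2d` and the `(2, H, W)` array layout are assumptions made for illustration.

```python
import numpy as np

def identity_grid(h, w):
    """Identity deformation: each pixel maps to its own (y, x) coordinate."""
    ys, xs = np.meshgrid(np.arange(h, dtype=float),
                         np.arange(w, dtype=float), indexing="ij")
    return np.stack([ys, xs])  # assumed layout: (2, H, W)

def jacobian_determinant_2d(phi):
    """Finite-difference DetJac of a 2D deformation phi with shape (2, H, W)."""
    dy_dy, dy_dx = np.gradient(phi[0])  # derivatives of the mapped y-coordinate
    dx_dy, dx_dx = np.gradient(phi[1])  # derivatives of the mapped x-coordinate
    return dy_dy * dx_dx - dy_dx * dx_dy

grid = identity_grid(32, 32)

# Identity map: no volume change anywhere.
det_identity = jacobian_determinant_2d(grid)

# Uniform scaling by 1.5 in both directions: area grows by 1.5 * 1.5.
det_scaled = jacobian_determinant_2d(1.5 * grid)

# Reflection along x: orientation flips, so DetJac becomes negative (folding).
flipped = grid.copy()
flipped[1] = -grid[1]
det_flipped = jacobian_determinant_2d(flipped)

num_folds_scaled = int((det_scaled <= 0).sum())
num_folds_flipped = int((det_flipped <= 0).sum())
```

In this toy setting the scaled field has no folded pixels while the reflected field is folded everywhere, which is the kind of signal a DetJac map makes visible for a generated deformation.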

Paper Structure

This paper contains 10 sections, 8 equations, 5 figures, 1 table, 2 algorithms.

Figures (5)

  • Figure 1: An overview of the proposed IGG model.
  • Figure 2: Comparison of predicted geodesics by IGG vs. real numerical solutions from the EPDiff equation. Left: Visualization of predicted deformed images, deformations, and velocities along time. Right: Mean absolute error of predicted velocities over time compared to numerical integration of EPDiff.
  • Figure 3: A comparison of images generated by IGG (with corresponding DetJac) against all baseline models across different time frames. Given an input template image and text instructions, all models generate samples of target images. The ground truth "target" along with input template images are provided on the left side of the panel for reference.
  • Figure 4: Left to right: input template images with text instructions, followed by confidence maps showing the lower bounds (LB), upper bounds (UB), and confidence intervals (CI), which mark the regions covering 95% of ideal growth patterns, based on 1000 samples generated by our IGG model across different time frames.
  • Figure : IGG Training
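The confidence maps described in the Figure 4 caption (lower bound, upper bound, and a 95% interval from 1000 samples) can be illustrated with per-pixel quantiles. This is a hedged sketch only: the caption does not state how the bounds are computed, so the quantile construction, the function name `confidence_maps`, and the synthetic Gaussian samples standing in for model outputs are all assumptions.

```python
import numpy as np

def confidence_maps(samples, level=0.95):
    """Per-pixel bounds from samples of shape (N, H, W).

    Assumption: bounds are taken as symmetric empirical quantiles.
    """
    alpha = (1.0 - level) / 2.0
    lb = np.quantile(samples, alpha, axis=0)        # lower bound map
    ub = np.quantile(samples, 1.0 - alpha, axis=0)  # upper bound map
    return lb, ub, ub - lb                          # interval width map

# Synthetic stand-in for 1000 generated samples of a 16x16 map.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=0.1, size=(1000, 16, 16))

lb, ub, ci = confidence_maps(samples)

# Fraction of sample values that fall inside the per-pixel band.
coverage = float(np.mean((samples >= lb) & (samples <= ub)))
```

A wide interval at a pixel indicates that the sampled outcomes disagree there, while a narrow one indicates that the samples are consistent.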