Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

TL;DR

This work introduces a principled Riemannian-geometry framework to fingerprint and attribute generative models on non-Euclidean data. By learning a latent Riemannian metric via a VAE and using geodesic distances and a gradient-based Riemannian center of mass, it defines artifacts and fingerprints that robustly distinguish a wide range of generative methods across multiple datasets and modalities. Empirically, the proposed Riemannian fingerprints outperform Euclidean baselines and show strong generalization to unseen datasets, model types, and vision-language models, with a ResNet-based classifier leveraging artifact features for attribution. The approach offers a practical, geometry-aware toolkit for model provenance and IP protection in real-world, multimodal settings.

Abstract

Recent breakthroughs and rapid integration of generative models (GMs) have sparked interest in the problem of model attribution and model fingerprints. For instance, service providers need reliable methods of authenticating their models to protect their IP, while users and law enforcement seek to verify the source of generated content for accountability and trust. In addition, the threat of model collapse is growing as more model-generated data are fed back into sources (e.g., YouTube) that are often harvested for training ("regurgitative training"), heightening the need to differentiate synthetic from human data. Yet a gap remains in understanding generative models' fingerprints, which we believe stems from the lack of a formal framework that can define, represent, and analyze fingerprints in a principled way. To address this gap, we take a geometric approach and propose a new definition of the artifact and fingerprint of GMs using Riemannian geometry, which allows us to leverage the rich theory of differential geometry. Our new definition generalizes previous work (Song et al., 2024) to non-Euclidean manifolds by learning Riemannian metrics from data and replacing the Euclidean distances and nearest-neighbor search with geodesic distances and a kNN-based Riemannian center of mass. We apply our theory to a new gradient-based algorithm for computing the fingerprints in practice. Results show that it is more effective in distinguishing a large array of GMs, spanning 4 datasets at 2 resolutions (64 by 64, 256 by 256), 27 model architectures, and 2 modalities (Vision, Vision-Language). Using our proposed definition significantly improves performance on model attribution, as well as generalization to unseen datasets, model types, and modalities, suggesting its practical efficacy.

Paper Structure

This paper contains 15 sections, 8 equations, 3 figures, 4 tables, 4 algorithms.

Figures (3)

  • Figure 1: Learning the latent Riemannian manifold from observed real data. (Left) We learn the latent geometry of data (real images) by training a VAE with mean and variance estimators and using its decoder to pull back the metric on the data space to the latent space. This pullback metric defines a Riemannian metric (Arvanitidis et al., 2017), based on which we compute geodesic distances and estimate a Riemannian center of mass (RCM) on the latent space in Alg. \ref{alg:project}. (Right) An illustration of the RCM of four points on a manifold.
  • Figure 2: Overview of our fingerprint estimation. We learn the latent data manifold ${\mathcal{M}}$ from the dataset of real images as a Riemannian manifold equipped with the pullback metric $G$. To pull back the metric of ${\mathcal{X}}$ to the latent manifold ${\mathcal{M}}$, we train a VAE (with mean and variance estimation functions $(\theta_{\mu},\theta_{\sigma})$) on the real dataset, and define the length of a curve on ${\mathcal{M}}$ to be the length of its decoded curve on ${\mathcal{X}}$, where we know how to measure length (i.e., via the Euclidean norm of a vector). From the length of a curve, we define the geodesic distance between two points on ${\mathcal{M}}$ as the length of the shortest curve connecting them.
  • Figure 3: Estimating the projection of $x_G$ onto the manifold ${\mathcal{M}}$ as the Riemannian center of mass of its $k$-nearest neighbors. (Left) Definition and fingerprint estimation proposed in (Song et al., 2024). Here, the projection of $z_G$ is estimated as its nearest neighbor (1-NN) in the observed real dataset, based on the standard Euclidean distance. (Middle) Baseline method using $k$-nearest neighbors ($k > 1$; $k = 3$ in this figure): we estimate the artifact of $z_G$ by finding the $k$-nearest neighbors of $z_G$ in the real dataset using the Euclidean distance, and computing their center of mass (also under the L2 distance). (Right) Our proposed method (R-gmftps: RCM), which estimates the artifact $a(x_G,{\mathcal{M}})$ using a Riemannian center of mass of the $k$-nearest neighbors, based on the geodesic distances learned from data (Sec. \ref{subsec/step1-learn-rmanifold}). Note that $z_{cm}$ does not lie on the manifold ${\mathcal{M}}$ (as manifested by the synthetic artifacts in the decoded $z_{cm}$ in the middle column), while the projection $z_{RCM}$ estimated as the Riemannian center of mass (right column) does lie on ${\mathcal{M}}$, thus corresponding to an actual real image. (Background manifold image modified with permission from Hauberg, 2018.)
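
The contrast drawn in the Figure 3 caption, between a Euclidean center of mass that leaves the manifold and a Riemannian center of mass that stays on it, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the unit sphere in $\mathbb{R}^3$ as a stand-in manifold, where geodesic distances and the exponential and logarithm maps have closed forms, whereas the paper computes geodesics under a pullback metric learned by a VAE. All function names and the three example points are hypothetical. The sketch also checks the idea from the Figure 2 caption that the length of a curve on the manifold can be measured by summing Euclidean segment lengths in the ambient space.

```python
import numpy as np

# ASSUMPTION: the unit sphere stands in for the learned latent manifold.
# The paper instead uses a VAE pullback metric; nothing here is its code.

def geodesic_distance(x, y):
    """Great-circle distance between unit vectors x and y."""
    return float(np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)))

def log_map(mu, x):
    """Tangent vector at mu pointing toward x, with norm d(mu, x)."""
    v = x - np.dot(mu, x) * mu          # part of x orthogonal to mu
    nv = np.linalg.norm(v)
    if nv < 1e-12:                      # x coincides with mu (or is antipodal)
        return np.zeros_like(mu)
    return geodesic_distance(mu, x) * v / nv

def exp_map(mu, v):
    """Follow the geodesic from mu in direction v for length ||v||."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return mu
    return np.cos(n) * mu + np.sin(n) * v / n

def riemannian_center_of_mass(points, step=1.0, iters=100, tol=1e-12):
    """Gradient descent on F(mu) = 1/(2k) * sum_i d(mu, x_i)^2."""
    mu = points.mean(axis=0)
    mu = mu / np.linalg.norm(mu)        # start from the projected Euclidean mean
    for _ in range(iters):
        # The negative Riemannian gradient of F is the mean of the log maps.
        descent = np.mean([log_map(mu, x) for x in points], axis=0)
        if np.linalg.norm(descent) < tol:
            break
        mu = exp_map(mu, step * descent)
    return mu

def curve_length(curve):
    """Length of a discretized curve: sum of Euclidean segment norms."""
    return float(np.sum(np.linalg.norm(np.diff(curve, axis=0), axis=1)))

# Hypothetical k = 3 "nearest neighbors", normalized onto the sphere.
neighbors = np.array([[1.0, 0.2, 0.0], [1.0, -0.1, 0.3], [1.0, 0.0, -0.2]])
neighbors /= np.linalg.norm(neighbors, axis=1, keepdims=True)

z_cm = neighbors.mean(axis=0)                   # Euclidean center of mass
z_rcm = riemannian_center_of_mass(neighbors)    # Riemannian center of mass

print("||z_cm||  =", np.linalg.norm(z_cm))      # below 1: off the sphere
print("||z_rcm|| =", np.linalg.norm(z_rcm))     # 1 up to rounding: on the sphere

# Discretize the geodesic between two neighbors and measure its length
# with ambient Euclidean norms.
a, b = neighbors[0], neighbors[1]
arc = np.array([exp_map(a, t * log_map(a, b)) for t in np.linspace(0.0, 1.0, 200)])
print("curve length      =", curve_length(arc))
print("geodesic distance =", geodesic_distance(a, b))
```

In this toy setting the Euclidean mean has norm strictly below one, so it sits inside the sphere rather than on it, while the gradient-descent estimate remains on the sphere by construction because every update is applied through the exponential map. On a learned latent manifold the closed-form maps are unavailable and would have to be replaced by numerical geodesic computations.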

Theorems & Definitions (4)

  • Definition 2.1
  • Definition 3.1: Artifact
  • Definition 3.2: Fingerprint
  • Definition 3.3