ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation

Ayush Roy, Wei-Yang Alex Lee, Rudrasis Chakraborty, Vishnu Suresh Lokhande

TL;DR

Manifold-Guided Distillation (ManifoldGD) is a training-free, diffusion-based framework that integrates manifold-consistent guidance at every denoising timestep. The authors present it as the first geometry-aware training-free data distillation framework.

Abstract

Large datasets hinder efficient model training and often contain redundant concepts. Dataset distillation aims to synthesize compact datasets that preserve the knowledge of large-scale training sets while drastically reducing storage and computation. Recent advances in diffusion models have enabled training-free distillation by leveraging pre-trained generative priors; however, existing guidance strategies remain limited. Current score-based methods either perform unguided denoising or rely on simple mode-based guidance toward instance prototype centroids (IPC centroids), which are often rudimentary and suboptimal. We propose Manifold-Guided Distillation (ManifoldGD), a training-free diffusion-based framework that integrates manifold-consistent guidance at every denoising timestep. Our method employs IPCs computed via a hierarchical, divisive clustering of VAE latent features, yielding a multi-scale coreset of IPCs that captures both coarse semantic modes and fine intra-class variability. Using a local neighborhood of the extracted IPC centroids, we estimate the latent manifold at each diffusion denoising timestep. At each denoising step, we project the mode-alignment vector onto the local tangent space of the estimated latent manifold, thus constraining the generation trajectory to remain manifold-faithful while preserving semantic consistency. This formulation improves representativeness, diversity, and image fidelity without requiring any model retraining. Empirical results demonstrate consistent gains over existing training-free and training-based baselines in terms of FID, $\ell_2$ distance between real and synthetic dataset embeddings, and classification accuracy, establishing ManifoldGD as the first geometry-aware training-free data distillation framework.
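
The tangent-space projection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the local tangent space is approximated by the span of the offsets of nearby IPC centroids from their mean (orthonormalized with Gram-Schmidt), and that the mode-alignment vector is simply the difference between a target centroid and the current latent. The function names `tangent_basis` and `project_onto_tangent`, the toy 3-D latents, and the tolerance are all hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tangent_basis(neighbors, tol=1e-9):
    """Orthonormal basis for the span of neighbor offsets about their mean.

    Assumption: this span stands in for the local tangent space of the
    latent manifold near the neighboring centroids.
    """
    dim = len(neighbors[0])
    mean = [sum(p[i] for p in neighbors) / len(neighbors) for i in range(dim)]
    basis = []
    for p in neighbors:
        v = [pi - mi for pi, mi in zip(p, mean)]
        # Modified Gram-Schmidt: remove components along existing basis vectors.
        for b in basis:
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > tol:  # skip offsets that add no new direction
            basis.append([vi / norm for vi in v])
    return basis


def project_onto_tangent(g, basis):
    """Keep only the component of g that lies in the span of the basis."""
    out = [0.0] * len(g)
    for b in basis:
        c = dot(g, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out


# Toy example: four neighboring centroids lying in the plane z = 0.
neighbors = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
x_t = [0.5, 0.5, 0.3]      # current latent, slightly off the plane
centroid = [1.0, 1.0, 0.0]  # target IPC centroid (hypothetical)

g_mode = [c - x for c, x in zip(centroid, x_t)]  # assumed mode-alignment vector
g_tangent = project_onto_tangent(g_mode, tangent_basis(neighbors))
print(g_tangent)  # the off-plane (z) component of g_mode is removed
```

In this toy setting the projected vector keeps the in-plane pull toward the centroid and drops the component normal to the plane, which is the intuition behind constraining the guidance to remain manifold-faithful. In practice the latents are high-dimensional, and the neighborhood size and the way the projected vector enters the denoising update are design choices the abstract does not specify.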

Paper Structure (21 sections, 11 equations, 16 figures, 10 tables, 1 algorithm)

Figures (16)

  • Figure 1: Manifold Guidance. Overall denoising trajectory correction by manifold guidance ($g^t_\mathrm{manifold}$) to rectify the off-manifold component of mode guidance ($g^t_\mathrm{mode}$). $\mathcal{M}_\mathrm{data}$ is embedded inside $\mathcal{M}_{t}$, and removing noise $\epsilon_t$ from $\mathcal{M}_{t}$ transforms it into $\mathcal{M}_{t-1}$. As $t \to 0$, $\mathcal{M}_{t} \to \mathcal{M}_\mathrm{data}$.
  • Figure 2: Qualitative samples generated by DiT peebles2023dit, MGD chan2025mgd3, and ManifoldGD. The samples generated by ManifoldGD have better image structure and quality (e.g., the MGD dog has its legs in an unusual position and the DiT dog is blurry; similarly, the MGD buildings have an uncommon structure, and the DiT ball image is of poor quality).
  • Figure 3: Qualitative evolution of generated samples across denoising timesteps ($t$). Comparison between MGD chan2025mgd3 and our ManifoldGD shows that early timesteps ($t{>}25$) capture coarse semantic structure via mode guidance, while later stages ($t{\leq}20$) refine geometry and texture under manifold constraints. ManifoldGD consistently yields sharper, more coherent generations with enhanced semantic fidelity and contrast. The zoomed insets (red boxes) highlight that MGD chan2025mgd3 often blurs key regions (e.g., missing eyes and an unclear nose in the dog, coarse floor textures in the church), whereas ManifoldGD preserves fine-grained lighting variations, reflections, and high-frequency details (e.g., distinct reflections from the chairs and colored windows), demonstrating geometrically consistent and visually realistic synthesis.
  • Figure 4: $\ell_2$ and MMD comparison. DiT peebles2023dit achieves inferior results (higher $\ell_2$ and MMD, $\uparrow$), while ManifoldGD achieves the best results (lower $\ell_2$ and MMD, $\downarrow$).
  • Figure 5: FID, representativeness, and diversity comparison of DiT peebles2023dit, MGD chan2025mgd3, and ManifoldGD. IPC 10, 20, and 50 are used for ImageNet-100 and ImageNette. ManifoldGD achieves lower FID (% drop over MGD chan2025mgd3 marked above the bars) and higher representativeness and diversity (R = representativeness, D = diversity; % increase over MGD chan2025mgd3 shown in the plot) across all settings.
  • ...and 11 more figures

Theorems & Definitions (3)

  • Definition 1: Tangent Score Function
  • Definition 2: Normal Space
  • Remark 1