2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching

Caleb Zheng, Eli Shlizerman

Abstract

Diffusion models achieve remarkable performance across diverse generative tasks in computer vision, but their high computational cost remains a major barrier to deployment. Model pruning offers a promising way to reduce inference cost and enable lightweight models. However, pruning degrades output quality because it reduces model capacity. A key limitation of existing pruning approaches is that pruned models are finetuned using the same objective as the dense model (denoising score matching). Because the dense model remains accessible during finetuning, it can serve as a teacher, which calls for a more effective way to transfer knowledge from the dense to the pruned model. Motivated by this, we propose 2ndMatch (2ndM), a general-purpose finetuning framework that introduces a 2nd-order Jacobian ($J^{\top} J$) Matching loss inspired by Finite-Time Lyapunov Exponents. 2ndM teaches the pruned model to mimic the sensitivity of the dense teacher, i.e., how it responds to small perturbations over time, through scalable random projections. The framework is architecture-agnostic and applies to both U-Net- and Transformer-based diffusion models. Experiments on CIFAR-10, CelebA, LSUN, ImageNet, and MSCOCO demonstrate that 2ndM reduces the performance gap between pruned and dense models, substantially improving output quality.
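
To make the idea of matching $J^{\top} J$ through random projections concrete, the following is a minimal sketch, not the authors' implementation. It relies on the identity $v^{\top} J^{\top} J v = \lVert J v \rVert^2$, so the sensitivity along a random direction $v$ can be estimated from a Jacobian-vector product without ever forming $J$. The function names (`jvp_fd`, `sensitivity`, `jtj_matching_loss`), the toy `teacher` and `pruned` maps, the use of central finite differences in place of automatic differentiation, and the squared-gap form of the loss are all assumptions made for illustration; the abstract does not specify the exact loss.

```python
import math
import random


def jvp_fd(f, x, v, eps=1e-5):
    """Central finite-difference approximation of the Jacobian-vector product J_f(x) v."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2 * eps) for a, b in zip(f(x_plus), f(x_minus))]


def sensitivity(f, x, v):
    """v^T J^T J v = ||J v||^2, computed without materializing J."""
    return sum(c * c for c in jvp_fd(f, x, v))


def jtj_matching_loss(student, teacher, x, num_probes=8, seed=0):
    """Mean squared gap between student and teacher sensitivities
    over random unit probe directions (assumed loss form)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        v = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        gap = sensitivity(student, x, v) - sensitivity(teacher, x, v)
        total += gap * gap
    return total / num_probes


# Hypothetical toy "networks": the pruned map has one weight zeroed out.
def teacher(x):
    return [math.tanh(2.0 * x[0] + x[1]), math.tanh(x[0] - x[1])]


def pruned(x):
    return [math.tanh(2.0 * x[0]), math.tanh(x[0] - x[1])]


x = [0.3, -0.2]
loss_self = jtj_matching_loss(teacher, teacher, x)   # identical maps: no gap
loss_pruned = jtj_matching_loss(pruned, teacher, x)  # pruned map: positive gap
```

In a real diffusion model, the finite-difference product would presumably be replaced by an autodiff Jacobian-vector product on the denoiser, and the probe count would trade estimator variance against compute.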

Paper Structure

This paper contains 22 sections, 9 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Examples of improved fidelity and consistency of output images compared to the baseline (BK-SDM) when pruning Stable Diffusion (SD) 1.4, the dense teacher, to 33%.
  • Figure 2: Sketch of the difference between the pruned Baseline (orange) and our 2ndM (purple). Both Baseline and 2ndM match the dense teacher (black) on training samples (green). 2ndM additionally constrains the expansion/contraction behavior of nearby points (red and blue), whereas the Baseline has no such constraint. As a result, the Baseline accumulates drift along the denoising trajectory, leading to deviations at $t = T$. In contrast, 2ndM preserves stable, teacher-aligned trajectories, resulting in higher-fidelity generation.
  • Figure 3: Per-class rFID and qualitative comparison between Dense, 2ndMatch, and baseline finetuning (DP) on ImageNet 256×256.