Delta Sampling: Data-Free Knowledge Transfer Across Diffusion Models

Zhidong Gao, Zimeng Pan, Yuhang Yao, Chenyue Xie, Wei Wei

TL;DR

Delta Sampling addresses cross-version reuse of diffusion-model adaptations without training data. It derives a residual delta between an adapted base model and the base, and injects this delta into a target model's denoising predictions during sampling, enabling data-free knowledge transfer. The method is plug-and-play, sampler-agnostic, and supports multiple adapters (LoRA, LyCORIS, ControlNet, IP-Adapter) and composed configurations. Empirical results across SD backbones (1.5 to 3.5, XL) show improved fidelity to target adaptations while maintaining diversity, demonstrating practical deployment benefits for community-driven diffusion ecosystems.

Abstract

Diffusion models like Stable Diffusion (SD) drive a vibrant open-source ecosystem including fully fine-tuned checkpoints and parameter-efficient adapters such as LoRA, LyCORIS, and ControlNet. However, these adaptation components are tightly coupled to a specific base model, making them difficult to reuse when the base model is upgraded (e.g., from SD 1.x to 2.x) due to substantial changes in model parameters and architecture. In this work, we propose Delta Sampling (DS), a novel method that enables knowledge transfer across base models with different architectures, without requiring access to the original training data. DS operates entirely at inference time by leveraging the delta: the difference in model predictions before and after the adaptation of a base model. This delta is then used to guide the denoising process of a new base model. We evaluate DS across various SD versions, demonstrating that DS achieves consistent improvements in creating desired effects (e.g., visual styles, semantic concepts, and structures) under different sampling strategies. These results highlight DS as an effective, plug-and-play mechanism for knowledge transfer in diffusion-based image synthesis. Code: https://github.com/Zhidong-Gao/DeltaSampling
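
The delta-guided prediction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `delta_sampling_eps`, the toy models, and the additive rule with a scalar strength `lam` are assumptions made for illustration (the figure captions mention a guidance strength $\lambda$, but this excerpt does not give the exact injection formula). The sketch also assumes that all three models accept the same latent shape, which may not hold across every pair of SD versions.

```python
import numpy as np

def delta_sampling_eps(eps_base, eps_adapt, eps_target, x_t, t, lam=1.0):
    """Delta-guided noise prediction for one denoising step.

    Assumed rule (illustrative, not confirmed by the source):
        eps_target(x_t, t) + lam * (eps_adapt(x_t, t) - eps_base(x_t, t))
    Each eps_* argument is any callable mapping (x_t, t) to an array
    with the same shape as x_t.
    """
    # Delta: how the adaptation changes the base model's prediction.
    delta = eps_adapt(x_t, t) - eps_base(x_t, t)
    # Inject the delta into the target model's prediction.
    return eps_target(x_t, t) + lam * delta

# Toy stand-ins for the three noise predictors (hypothetical).
def toy_base(x, t):
    return 0.5 * x

def toy_adapt(x, t):
    return 0.5 * x + 0.1  # "adaptation" adds a constant shift

def toy_target(x, t):
    return 0.4 * x

x_t = np.ones((1, 4, 8, 8))  # latent-shaped dummy input
eps = delta_sampling_eps(toy_base, toy_adapt, toy_target, x_t, t=10, lam=1.0)
```

Because the combined prediction has the same shape as an ordinary noise prediction, it can in principle be passed to whichever sampler update rule is in use, which is consistent with the sampler-agnostic claim above. Setting `lam=0` recovers the unmodified target model.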

Paper Structure

This paper contains 40 sections, 14 equations, 18 figures, 1 algorithm.

Figures (18)

  • Figure 1: Overview of Delta Sampling. Each denoising step proceeds as follows: 1) the base pre-trained diffusion model predicts the noise $\epsilon_{\text{base}}(x_t,t)$; 2) the adapted diffusion model (full fine-tune, LoRA, LyCORIS, ControlNet, etc.) predicts the noise $\epsilon_{\text{adapt}}(x_t,t)$; 3) the delta is computed as $\epsilon_{\text{adapt}}(x_t,t) - \epsilon_{\text{base}}(x_t,t)$; 4) the target pre-trained diffusion model predicts the noise $\epsilon_{\text{target}}(x_t,t)$; and 5) the delta is injected into $\epsilon_{\text{target}}(x_t,t)$ to guide the denoising process of the target model.
  • Figure 2: DS with a fully fine-tuned checkpoint, LoRA, and ControlNet. From top to bottom, the results correspond to different control conditions: depth (first rows), Canny edge (second rows), human pose (third rows), and segmentation (bottom rows).
  • Figure 3: We compare: (1) SD-2.1 Only (baseline, no adaptation), (2) DS applying a 32-rank, 16-alpha LoRA (DS w/ LoRA) to SD-2.1, (3) DS applying a 16-rank, 8-alpha LoHa (DS w/ LoHa) to SD-2.1, and (4) SD-1.5 with the original 32-rank, 16-alpha LoRA (baseline, representing the desired target effect). Similarity measures adherence to the target style/concept (e.g., using CLIP-Score with reference prompts/images), while diversity assesses visual variation among images generated for the same prompt (e.g., using average LPIPS between pairs).
  • Figure 4: DS with fully fine-tuned checkpoints. From top to bottom, the results correspond to three different checkpoints: LineArt.safetensors (first two rows), Photon_v1.safetensors (middle two rows), and revAnimated_v2.safetensors (bottom two rows). From left to right, each column shows the result generated with a different guidance strength $\lambda$.
  • Figure 5: Robustness of Delta Sampling (DS) across different diffusion samplers. We apply DS to transfer the animeoutlineV4_16 LoRA (trained on SD-1.5) to SD-2.1 under a wide range of samplers, including DDIM, DDPM, DDPMPP, DPM2, Euler, Gradient Estimation Sampler (GES), Heun, and Uni-PC. All generations use the same prompt describing an anime-style girl in a flower field with monochrome lineart aesthetics. Despite substantial differences in solver dynamics, numerical stability, and stochasticity, DS consistently preserves the intended line-art adaptation across all samplers, demonstrating strong sampler-agnostic robustness.
  • ...and 13 more figures
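
The diversity measure mentioned in the Figure 3 caption (an average distance over pairs of images generated for the same prompt) can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: `mean_pairwise_distance` and `rms_distance` are hypothetical names, and the root-mean-square pixel distance is only a stand-in for a perceptual metric such as LPIPS, which would be supplied as the `distance` callable in practice.

```python
from itertools import combinations

import numpy as np

def mean_pairwise_distance(images, distance):
    """Average `distance(a, b)` over all unordered pairs of images."""
    pairs = list(combinations(range(len(images)), 2))
    if not pairs:
        raise ValueError("need at least two images")
    return float(np.mean([distance(images[i], images[j]) for i, j in pairs]))

def rms_distance(a, b):
    """Root-mean-square pixel difference; a stand-in, NOT LPIPS itself."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Four random "images" standing in for samples from one prompt.
rng = np.random.default_rng(0)
samples = [rng.random((3, 16, 16)) for _ in range(4)]
diversity = mean_pairwise_distance(samples, rms_distance)
```

A higher value indicates more visual variation among the samples; identical images give a value of zero.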