HiCache: A Plug-in Scaled-Hermite Upgrade for Taylor-Style Cache-then-Forecast Diffusion Acceleration

Liang Feng, Shikang Zheng, Jiacheng Liu, Yuqi Lin, Qinming Zhou, Peiliang Cai, Xinyu Wang, Junjie Chen, Chang Zou, Yue Ma, Linfeng Zhang

TL;DR

Diffusion models deliver high-fidelity content but incur heavy compute from iterative sampling. HiCache introduces a training-free, Hermite-based feature caching strategy with dual-scaling, replacing the Taylor monomial basis to better capture Gaussian-like feature dynamics. It achieves large speedups (e.g., $5.55\times$ on FLUX.1-dev) while preserving or improving quality, and it works as a plug-in for existing cache-then-forecast methods such as TaylorSeer and ClusCa. The practical impact is reduced compute and energy for diffusion-based generation across text-to-image, text-to-video, and super-resolution tasks.

Abstract

Diffusion models have achieved remarkable success in content generation but often incur prohibitive computational costs due to iterative sampling. Recent feature caching methods accelerate inference via temporal extrapolation, yet can suffer quality degradation from inaccurate modeling of the complex dynamics of feature evolution. We propose HiCache (Hermite Polynomial-based Feature Cache), a training-free acceleration framework that improves feature prediction by aligning mathematical tools with empirical properties. Our key insight is that feature-derivative approximations in diffusion Transformers exhibit multivariate Gaussian characteristics, motivating the use of Hermite polynomials as a potentially optimal basis for Gaussian-correlated processes. We further introduce a dual-scaling mechanism that ensures numerical stability while preserving predictive accuracy, and is also effective when applied standalone or integrated with TaylorSeer. Extensive experiments demonstrate HiCache's superiority, achieving 5.55x speedup on FLUX.1-dev while matching or exceeding baseline quality, and maintaining strong performance across text-to-image, video generation, and super-resolution tasks. Moreover, HiCache can be naturally added to previous caching methods to enhance their performance, e.g., improving ClusCa from 0.9480 to 0.9840 in terms of image rewards. Code: https://github.com/fenglang918/HiCache
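The cache-then-forecast idea described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: all function names (`backward_differences`, `hermite`, `scaled_hermite`, `taylor_forecast`, `hermite_forecast`) and the parameters `x` and `sigma` are hypothetical, the physicists' Hermite convention is assumed, and "dual-scaling" is read here as contracting the input by `sigma` while damping the i-th term by `sigma**i`, which is one plausible form rather than something stated in the text above.

```python
from math import factorial


def backward_differences(cached, order):
    """Return [D^0 F, D^1 F, ..., D^order F] at the newest cached feature.

    `cached` is ordered oldest -> newest and assumed equally spaced in time.
    Entries may be floats or anything supporting subtraction (e.g. arrays).
    """
    if len(cached) < order + 1:
        raise ValueError("need at least order + 1 cached features")
    level = list(cached[-(order + 1):])
    diffs = [level[-1]]
    for _ in range(order):
        level = [b - a for a, b in zip(level[:-1], level[1:])]
        diffs.append(level[-1])
    return diffs


def hermite(n, x):
    """Physicists' Hermite polynomial via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * m * h_prev
    return h


def scaled_hermite(i, x, sigma):
    """ASSUMED dual-scaled basis: sigma**i * H_i(sigma * x)."""
    return (sigma ** i) * hermite(i, sigma * x)


def taylor_forecast(diffs, x):
    """Taylor-style forecast with the monomial basis x**i."""
    return sum(d * (x ** i) / factorial(i) for i, d in enumerate(diffs))


def hermite_forecast(diffs, x, sigma):
    """Same functional form, with x**i swapped for the scaled Hermite basis."""
    return sum(d * scaled_hermite(i, x, sigma) / factorial(i)
               for i, d in enumerate(diffs))


if __name__ == "__main__":
    # Toy scalar "features" sampled at three equally spaced cached steps.
    diffs = backward_differences([0.0, 1.0, 4.0], order=2)
    # x is the forecast distance in units of the cache interval (hypothetical).
    print(taylor_forecast(diffs, 1.0))
    print(hermite_forecast(diffs, 1.0, sigma=2 ** -0.5))
```

With `sigma = 1/sqrt(2)` the assumed scaled basis reduces to the probabilists' Hermite polynomials ($x$, $x^2 - 1$, ...), so on this toy input the second-order term vanishes at `x = 1` and the Hermite forecast is about 7.0 versus 8.0 for the monomial basis. The toy numbers only show how the two bases differ, not which one is more accurate on real features.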

Paper Structure

This paper contains 55 sections, 20 theorems, 38 equations, 9 figures, 9 tables, 3 algorithms.

Key Result

Proposition 1

For feature trajectories with bounded variation that nevertheless contain turning points, the Taylor prediction error grows with a bound whose supremum term can be arbitrarily large at trajectory inflection points.
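For orientation, the textbook Lagrange-form bound for an order-$N$ Taylor extrapolation has the shape below. It is given here only as an assumption about the kind of bound Proposition 1 refers to; the paper's exact expression may differ, and the symbols $F$, $N$, $t$, and the step $h$ are illustrative.

```latex
\left\| F(t+h) - \sum_{i=0}^{N} \frac{F^{(i)}(t)}{i!}\, h^{i} \right\|
\;\le\; \frac{|h|^{N+1}}{(N+1)!}\,
\sup_{\xi \text{ between } t \text{ and } t+h} \left\| F^{(N+1)}(\xi) \right\|
```

The supremum of the $(N+1)$-th derivative is the term that blows up when the trajectory bends sharply, which is consistent with the proposition's remark about inflection points.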

Figures (9)

  • Figure 1: Teaser of HiCache using the FLUX.1-dev model, with a $6.24\times$ FLOPs speedup.
  • Figure 2: Trajectory prediction comparison of the Taylor and HiCache methods on FLUX. "Full trajectory" denotes the feature trajectory of the original FLUX model (Esser et al., 2024) at the 14th and 28th layers. The y-axis shows the first principal component of the features in the diffusion model.
  • Figure 3: Method overview and basis function comparison. (a) TaylorSeer (orange) predicts features using a power basis, whereas HiCache (green) keeps the functional form but swaps the basis for scaled Hermite polynomials. (b) The oscillatory behavior of Hermite polynomials (e.g., the negative offset of $H_2(x)$) captures non-monotonic evolution better than the monotonic growth of the Taylor basis.
  • Figure 4: Qualitative comparison on the text-to-image task across diverse prompts.
  • Figure 5: Detail retention and style consistency. (a) Superior detail retention. (b) Greater stability than TaylorSeer under higher acceleration, with consistent style and clean outputs.
  • ...and 4 more figures
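
The remark in Figure 3(b) about $H_2(x)$ can be checked against the explicit low-order polynomials. The listing below assumes the physicists' convention, which the caption does not state; under the probabilists' convention, $He_2(x) = x^2 - 1$ shows the same negative offset.

```latex
\begin{aligned}
H_0(x) &= 1, \\
H_1(x) &= 2x, \\
H_2(x) &= 4x^2 - 2, \\
H_3(x) &= 8x^3 - 12x, \\
H_{n+1}(x) &= 2x\,H_n(x) - 2n\,H_{n-1}(x).
\end{aligned}
```

Whereas $x^2$ is non-negative and grows monotonically for $x \ge 0$, $H_2(x)$ is negative for $|x| < 1/\sqrt{2}$ and changes sign at $|x| = 1/\sqrt{2}$, so a second-order term in this basis can pull a forecast back rather than only push it outward.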

Theorems & Definitions (32)

  • Definition 1: Finite Difference Operator
  • Proposition 1: Limitation of Monomial Basis
  • Proposition 2: Gaussianity of Feature Differences
  • Corollary 1: Optimal Basis Selection
  • Definition 2: Scaled Hermite Basis
  • Proposition 3: HiCache Feature Prediction
  • Lemma 1: Truncation Error Bound
  • proof
  • Remark 1: Comparison with Taylor Expansion
  • Lemma 2: Finite Difference Approximation Error
  • ...and 22 more