Local Curvature Smoothing with Stein's Identity for Efficient Score Matching

Genki Osada; Makoto Shing; Takashi Nishide

TL;DR

This work targets a computational bottleneck in training score-based diffusion models (SDMs): the Jacobian-trace term in the score matching objective. The authors introduce local curvature smoothing with Stein's identity (LCSS), which averages the objective over a Gaussian neighborhood and applies Stein's identity to recast the trace as an efficiently computable inner product, providing regularization without requiring an affine SDE. The time-conditioned LCSS objective integrates naturally into SDMs, supports flexible forward processes, and yields faster training, stable convergence, and competitive or superior sample quality, including high-resolution generation at up to $1024\times1024$. Empirically, LCSS outperforms sliced score matching (SSM) and FD-SSM in density estimation and training efficiency, and matches or surpasses denoising score matching (DSM) on several qualitative and quantitative metrics, while avoiding DSM's affine-SDE constraint and the associated instabilities. By decoupling score matching from affine forward dynamics, the method broadens the design space for SDMs, with practical implications for scalable, high-fidelity image generation.

Abstract

Score-based diffusion models (SDMs) are trained by score matching. The difficulty is that the score matching objective contains a Jacobian trace, which is computationally expensive. Several methods have been proposed to avoid this computation, but each has drawbacks, such as unstable training or learning a denoising vector field as an approximation rather than the true score. We propose a novel score matching variant, local curvature smoothing with Stein's identity (LCSS). LCSS bypasses the Jacobian trace by applying Stein's identity, which provides both a regularization effect and efficient computation. We show that LCSS surpasses existing methods in sample generation performance and matches the performance of denoising score matching (DSM), which most SDMs adopt, on metrics such as FID, Inception Score, and bits per dimension. Furthermore, we show that LCSS enables realistic image generation even at a high resolution of $1024 \times 1024$.
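To make the role of Stein's identity concrete, the following is a minimal numerical sketch, not the authors' LCSS objective or implementation. It assumes a Gaussian perturbation $x + \sigma\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$ and checks the standard identity $\mathbb{E}[\epsilon^\top s(x + \sigma\epsilon)] = \sigma\,\mathbb{E}[\mathrm{tr}\,\nabla s(x + \sigma\epsilon)]$ by Monte Carlo. The names `stein_trace_estimate` and `linear_score`, the matrix `A`, and the vector `b` are hypothetical and chosen only for illustration; a linear map is used because its Jacobian trace is known exactly.

```python
import numpy as np


def stein_trace_estimate(score_fn, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the Gaussian-averaged Jacobian trace of score_fn.

    Uses only inner products between noise and function outputs,
    so no Jacobian is ever formed (illustrative helper, hypothetical name).
    """
    eps = rng.standard_normal((n_samples, x.shape[0]))  # eps ~ N(0, I)
    scores = score_fn(x + sigma * eps)                  # shape (n_samples, d)
    inner = np.sum(eps * scores, axis=1)                # eps^T s(x + sigma * eps)
    return np.mean(inner) / sigma


# Toy "score" with a known Jacobian: s(x) = A x + b, so tr(grad s) = tr(A).
# A and b are arbitrary illustrative values, not taken from the paper.
A = np.array([[-1.0, 0.3, 0.0],
              [0.2, -2.0, 0.1],
              [0.0, -0.4, -0.5]])
b = np.array([0.5, -0.2, 0.1])


def linear_score(points):
    # points has shape (n_samples, d); apply A to each row and add b.
    return points @ A.T + b


rng = np.random.default_rng(0)
x = np.array([0.3, -0.1, 0.7])
estimate = stein_trace_estimate(linear_score, x, sigma=0.5,
                                n_samples=200_000, rng=rng)
exact = np.trace(A)
print(f"Stein estimate: {estimate:.3f}, exact trace: {exact:.3f}")
```

The estimate needs only forward evaluations of the function, which is why an inner-product form is cheaper than computing the trace of a Jacobian directly. For a nonlinear score network the same estimator would target the trace averaged over the Gaussian neighborhood of $x$ rather than the trace at $x$ itself.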
