Sliced Score Matching: A Scalable Approach to Density and Score Estimation

Yang Song; Sahaj Garg; Jiaxin Shi; Stefano Ermon

TL;DR

This work introduces sliced score matching (SSM), a scalable variant of score matching that projects high-dimensional scores onto random directions so that the trace of the Hessian never has to be computed explicitly. Because the projected objective needs only Hessian-vector products, in the spirit of Hutchinson-style trace estimation, SSM can train deep unnormalized models and estimate scores of implicit distributions, and the resulting estimators are proven to be consistent and asymptotically normal. The paper applies SSM to density estimation with deep kernel exponential families and NICE flows, and to score estimation when training VAEs and WAEs. Empirically, SSM and its variance-reduced variant are reported to outperform existing scalable alternatives on these density estimation and score estimation tasks, including in high-dimensional settings.
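
For readers who want the projection idea in symbols, the block below is a short sketch rather than a quotation from the paper. The notation is an assumption introduced here and does not appear in this excerpt: s_m(x; θ) is the model score (the gradient of the model log-density with respect to x), p_d is the data distribution, and p_v is the distribution of random projection vectors v, assumed to satisfy E[v vᵀ] = I (for example standard Gaussian or Rademacher vectors).

```latex
% Hutchinson-style identity: a random quadratic form recovers the trace.
\mathbb{E}_{v \sim p_v}\!\left[ v^\top A\, v \right] = \operatorname{tr}(A)
\quad \text{whenever } \mathbb{E}_{p_v}\!\left[ v v^\top \right] = I .

% Sliced score matching objective: only the product (\nabla_x s_m)\, v is needed.
J(\theta; p_v) = \mathbb{E}_{p_v}\,\mathbb{E}_{p_d}\!\left[
  v^\top \nabla_x s_m(x;\theta)\, v
  + \tfrac{1}{2}\left( v^\top s_m(x;\theta) \right)^{2} \right]

% Variance-reduced form: since
% \mathbb{E}_{v}\big[(v^\top s)^2\big] = s^\top \mathbb{E}[v v^\top]\, s = \lVert s \rVert_2^2,
% the second term can be integrated over v in closed form.
J_{\mathrm{vr}}(\theta; p_v) = \mathbb{E}_{p_v}\,\mathbb{E}_{p_d}\!\left[
  v^\top \nabla_x s_m(x;\theta)\, v
  + \tfrac{1}{2}\left\lVert s_m(x;\theta) \right\rVert_2^{2} \right]
```

Neither form ever materializes the full Hessian of the log-density. The only second-order quantity is the vector (∇ₓ s_m) v, which is what makes the objective tractable for deep models.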

Abstract

Score matching is a popular method for estimating unnormalized statistical models. However, it has so far been limited to simple, shallow models or low-dimensional data, due to the difficulty of computing the Hessian of log-density functions. We show this difficulty can be mitigated by projecting the scores onto random vectors before comparing them. This objective, called sliced score matching, only involves Hessian-vector products, which can be easily implemented using reverse-mode automatic differentiation. Therefore, sliced score matching is amenable to more complex models and higher-dimensional data compared to score matching. Theoretically, we prove the consistency and asymptotic normality of sliced score matching estimators. Moreover, we demonstrate that sliced score matching can be used to learn deep score estimators for implicit distributions. In our experiments, we show sliced score matching can learn deep energy-based models effectively, and can produce accurate score estimates for applications such as variational inference with implicit distributions and training Wasserstein Auto-Encoders.
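
To make the "Hessian-vector products via reverse-mode automatic differentiation" step concrete, here is a minimal JAX sketch of a sliced score matching loss. It is an illustration under stated assumptions, not the authors' implementation: the toy model `log_density` (a diagonal Gaussian with learnable mean and log-precision) and the names `score`, `ssm_loss_single`, and `ssm_loss` are hypothetical and chosen only for this example.

```python
import jax
import jax.numpy as jnp


def log_density(theta, x):
    """Hypothetical unnormalized log-density (assumption for illustration):
    a diagonal Gaussian with learnable mean and log-precision."""
    mu, log_prec = theta
    return -0.5 * jnp.sum(jnp.exp(log_prec) * (x - mu) ** 2)


def score(theta, x):
    # Model score: gradient of the log-density with respect to the data point x.
    return jax.grad(log_density, argnums=1)(theta, x)


def ssm_loss_single(theta, x, v):
    """Sliced score matching loss for one data point x and one projection v."""

    def projected_score(z):
        # Scalar projection v^T s(z); differentiating it needs one reverse pass.
        return jnp.dot(v, score(theta, z))

    # grad_proj = (d s / d x)^T v, a Hessian-vector product of the log-density.
    proj, grad_proj = jax.value_and_grad(projected_score)(x)
    # v^T (d s / d x) v + 0.5 * (v^T s)^2, with no full Hessian ever formed.
    return jnp.dot(v, grad_proj) + 0.5 * proj ** 2


def ssm_loss(theta, xs, vs):
    # Average over a batch, pairing each data point with one projection vector.
    per_example = jax.vmap(ssm_loss_single, in_axes=(None, 0, 0))(theta, xs, vs)
    return jnp.mean(per_example)


if __name__ == "__main__":
    key_x, key_v = jax.random.split(jax.random.PRNGKey(0))
    xs = jax.random.normal(key_x, (256, 2)) + 1.0                   # toy data
    vs = jax.random.rademacher(key_v, (256, 2), dtype=jnp.float32)  # random +/-1 directions
    theta = (jnp.zeros(2), jnp.zeros(2))                            # (mean, log-precision)
    loss, grads = jax.value_and_grad(ssm_loss)(theta, xs, vs)
    print(loss, grads)
```

The design choice worth noting is that the projection is applied before differentiating: taking the gradient of the scalar `v^T s(x)` costs about as much as one extra backward pass, whereas the trace term in ordinary score matching would need one such pass per data dimension. In practice the toy `log_density` would be replaced by a neural energy function, and `grads` would be passed to an optimizer.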
