Sobolev Space Regularised Pre Density Models

Mark Kozdoba; Binyamin Perets; Shie Mannor

TL;DR

SOSREP presents a Sobolev-space regularised pre-density estimator in a reproducing kernel Hilbert space, yielding unnormalized but integrable pre-densities via $(f^*)^2$ and enabling principled, interpretable regularisation of high-dimensional densities. The method resolves non-convex optimisation with a nonnegative initialisation and natural-gradient updates, and uses a sampling-based kernel approximation for the associated SDO kernel. Consistency is proven in fixed dimensions, and hyperparameters are chosen with Fisher-divergence based score matching due to the lack of normalization. Empirically, SOSREP ranks highly on the ADBench anomaly-detection suite, showing robustness to duplicate anomalies and competitive performance without heavy task-specific tailoring, highlighting its potential as a flexible non-parametric density estimator and a basis for generative modelling.

Abstract

We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density. This method is statistically consistent, and makes the inductive bias of the model clear and interpretable. While there is no closed analytic form for the associated kernel, we show that one can approximate it using sampling. The optimization problem needed to determine the density is non-convex, and standard gradient methods do not perform well. However, we show that with an appropriate initialization and using natural gradients, one can obtain well performing solutions. Finally, while the approach provides pre-densities (i.e. not necessarily integrating to 1), which prevents the use of log-likelihood for cross validation, we show that one can instead adapt Fisher divergence based score matching methods for this task. We evaluate the resulting method on the comprehensive recent anomaly detection benchmark suite, ADBench, and find that it ranks second best, among more than 15 algorithms.
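The last point, model selection without a normalising constant, can be illustrated with a minimal sketch. It uses the standard Hyvärinen score matching objective in one dimension on a toy unnormalised Gaussian family, not the SOSREP model or its kernel. The names `score_matching_objective` and `make_log_q`, the bandwidth grid, and the finite-difference step are assumptions made for illustration only.

```python
import numpy as np

def score_matching_objective(log_q, x, h=1e-3):
    """Empirical Hyvarinen objective mean(psi' + 0.5 * psi^2) in 1-D,
    where psi = d/dx log q. Derivatives use central finite differences."""
    lp, l0, lm = log_q(x + h), log_q(x), log_q(x - h)
    psi = (lp - lm) / (2 * h)            # score of the unnormalised model
    dpsi = (lp - 2 * l0 + lm) / h**2     # derivative of the score
    return float(np.mean(dpsi + 0.5 * psi**2))

def make_log_q(sigma, log_scale=0.0):
    """Unnormalised log-density: log_scale - x^2 / (2 sigma^2) (toy family)."""
    return lambda x: log_scale - 0.5 * (x / sigma) ** 2

rng = np.random.default_rng(0)
x_val = rng.normal(0.0, 1.0, size=2000)  # held-out data from N(0, 1)

sigmas = [0.5, 1.0, 2.0]                 # hypothetical hyperparameter grid
scores = [score_matching_objective(make_log_q(s), x_val) for s in sigmas]
best_sigma = sigmas[int(np.argmin(scores))]

# Rescaling the pre-density by a constant leaves the objective unchanged.
shifted = score_matching_objective(make_log_q(1.0, log_scale=5.0), x_val)
```

Because the objective depends on the model only through derivatives of its log, any multiplicative constant drops out, which is why a criterion of this kind remains usable for pre-densities where held-out log-likelihood is not.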

Paper Structure (45 sections, 20 theorems, 101 equations, 7 figures, 6 algorithms)

Introduction
Literature and Related Work
The SOSREP Density Estimator
The Basic Framework
Convexity, Positivity, and Natural Gradients
Explicit Form of Gradients
Single Derivative Order Kernel Approximation
The Kernel in Integral Form
Kernel Evaluation via Sampling
Relations With Other Kernels
Consistency
Experiments
Anomaly Detection Results for ADBench
Fisher-Divergence Based Hyperparameter Tuning
Natural-Gradient vs Standard Gradient Comparison
...and 30 more sections

Key Result

Lemma 1

The standard and the natural gradients of $L(f)$ are given by [displayed equation missing in source], where, for a vector $v \in \mathbb{R}^d$, $v^{-1}$ denotes coordinatewise inversion.
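To make the role of coordinatewise inversion concrete, here is a minimal sketch under stated assumptions. It is not a reconstruction of the lemma's formula. It assumes that $f = \sum_i \alpha_i k(x_i, \cdot)$, so that $f(x_i) = (K\alpha)_i$ for a symmetric kernel matrix $K$, and that the objective has the form $L(\alpha) = -\frac{1}{N}\sum_i \log f(x_i)^2 + \beta\,\alpha^\top K \alpha$. The Gaussian kernel below is a stand-in for the SDO kernel, and all function names and the value of `beta` are hypothetical.

```python
import numpy as np

def objective(alpha, K, beta):
    """Assumed objective: -(1/N) sum_i log f(x_i)^2 + beta * alpha^T K alpha."""
    f = K @ alpha                          # f(x_i) = (K alpha)_i
    return -np.mean(np.log(f**2)) + beta * alpha @ K @ alpha

def standard_gradient(alpha, K, beta):
    """Euclidean gradient of the assumed objective with respect to alpha."""
    f = K @ alpha
    return 2.0 * (beta * f - K @ (1.0 / f) / len(alpha))

def natural_gradient(alpha, K, beta):
    """Gradient in the RKHS geometry, i.e. K^{-1} times the standard gradient.
    The factor K cancels, leaving only a coordinatewise inversion of K alpha."""
    f = K @ alpha
    return 2.0 * (beta * alpha - (1.0 / f) / len(alpha))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)                # Gaussian kernel stand-in (assumption)

alpha = np.full(6, 0.5)                    # nonnegative initialisation
beta = 0.1                                 # hypothetical regularisation weight
g = standard_gradient(alpha, K, beta)
ng = natural_gradient(alpha, K, beta)
```

Under these assumptions the natural gradient needs neither a multiplication by $K$ nor a linear solve, and a descent step would read `alpha = alpha - lr * ng` for some step size `lr`.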

Figures (7)

Figure 1: Anomaly Detection Results on ADBench, Higher is Better. SOSREP is Second Best Among 18 Algorithms
Figure 2: For each hyperparameter (HP) value a, we calculate the Fisher divergence between the density learned on the training set and the density inferred on the test set.
Figure 3: Fraction of negative values for natural versus $\alpha$ gradient-based optimization across datasets. The x-axis represents datasets from ADBench (see the Supplementary Material for details).
Figure 4: (a) Distribution of Kernel Values and SDO Kernel Values Inside and Between Clusters. (b) SOSREP and KDE Log-likelihoods. The x-axis represents points in the data, arranged by clusters; the y-axis shows the log-likelihood.
Figure 5: Anomaly Detection Results on ADBench. Relative Ranking Per Dataset, Higher is Better. SOSREP is Second Best Among 18 Algorithms
...and 2 more figures

Theorems & Definitions (38)

Lemma 1: Gradients
Theorem 2
Theorem 3
Lemma 4
proof
proof: Proof of Lemma 1
Lemma 5
proof
Proposition 6
proof
...and 28 more
