Sobolev Space Regularised Pre Density Models

Mark Kozdoba, Binyamin Perets, Shie Mannor

TL;DR

SOSREP is a Sobolev-space regularised pre-density estimator formulated in a reproducing kernel Hilbert space. It yields unnormalised but integrable pre-densities of the form $(f^*)^2$ and makes the regularisation of high-dimensional densities principled and interpretable. The non-convex optimisation is handled with a nonnegative initialisation and natural-gradient updates, and the associated SDO kernel, which has no closed form, is approximated by sampling. Consistency is proven in fixed dimension, and because the model is not normalised, hyperparameters are chosen by Fisher-divergence based score matching rather than log-likelihood. Empirically, SOSREP ranks second among the algorithms compared on the ADBench anomaly-detection suite, is robust to duplicate anomalies, and is competitive without heavy task-specific tailoring, which suggests its potential as a flexible non-parametric density estimator and a basis for generative modelling.

Abstract

We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density. This method is statistically consistent, and makes the inductive bias of the model clear and interpretable. While there is no closed analytic form for the associated kernel, we show that one can approximate it using sampling. The optimization problem needed to determine the density is non-convex, and standard gradient methods do not perform well. However, we show that with an appropriate initialization and using natural gradients, one can obtain well performing solutions. Finally, while the approach provides pre-densities (i.e. not necessarily integrating to 1), which prevents the use of log-likelihood for cross validation, we show that one can instead adapt Fisher divergence based score matching methods for this task. We evaluate the resulting method on the comprehensive recent anomaly detection benchmark suite, ADBench, and find that it ranks second best, among more than 15 algorithms.
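
The abstract's point that score matching can replace log-likelihood for unnormalised models can be illustrated with a small sketch. It is a toy under stated assumptions, not the paper's procedure: the model family is a one-dimensional unnormalised Gaussian rather than SOSREP, and the names `log_pre_density`, `hyvarinen_score`, and `candidate_sigmas` are hypothetical. It relies on the standard fact that the Hyvärinen objective $\mathbb{E}[\partial_x^2 \log q + \tfrac{1}{2}(\partial_x \log q)^2]$ equals the Fisher divergence up to a model-independent constant and is unchanged when $q$ is rescaled.

```python
import numpy as np

def log_pre_density(x, sigma):
    # Unnormalised Gaussian log-density: the normalising constant is deliberately omitted.
    return -0.5 * (x / sigma) ** 2

def hyvarinen_score(log_q, x, h=1e-3):
    """Empirical score-matching objective in 1D, using central finite differences.

    Only derivatives of log q appear, so any additive constant in log q
    (i.e. any normalisation factor of q) cancels out.
    """
    d1 = (log_q(x + h) - log_q(x - h)) / (2.0 * h)               # d/dx log q
    d2 = (log_q(x + h) - 2.0 * log_q(x) + log_q(x - h)) / h ** 2  # d^2/dx^2 log q
    return float(np.mean(d2 + 0.5 * d1 ** 2))

# Toy held-out data (assumption: true scale 1.5).
rng = np.random.default_rng(0)
held_out = rng.normal(loc=0.0, scale=1.5, size=2000)

# Hypothetical hyperparameter grid; lower objective is better.
candidate_sigmas = [0.5, 1.0, 1.5, 2.0, 3.0]
scores = {
    s: hyvarinen_score(lambda x, s=s: log_pre_density(x, s), held_out)
    for s in candidate_sigmas
}
best_sigma = min(scores, key=scores.get)
print(best_sigma, scores[best_sigma])
```

On this toy data the grid search selects the candidate closest to the true scale, even though none of the candidate models integrates to 1.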

Paper Structure

This paper contains 45 sections, 20 theorems, 101 equations, 7 figures, 6 algorithms.

Key Result

Lemma 1

Closed-form expressions are given for the standard and the natural gradients of $L(f)$; in them, for a vector $v \in \mathbb{R}^d$, $v^{-1}$ denotes coordinatewise inversion.
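
To make the role of coordinatewise inversion concrete, here is a hedged sketch of what such gradients look like under an assumed objective. The form of the loss is an assumption that is not stated in this summary: $f = \sum_i \alpha_i k(x_i, \cdot)$ with symmetric Gram matrix $K$, and $L(\alpha) = -\frac{1}{N}\sum_i \log (K\alpha)_i^2 + \alpha^\top K \alpha$. Under that assumption the gradient with respect to $\alpha$ is $2[K\alpha - \frac{1}{N}K(K\alpha)^{-1}]$, and preconditioning by $K^{-1}$ gives the natural gradient $2[\alpha - \frac{1}{N}(K\alpha)^{-1}]$. A Gaussian kernel stands in for the SDO kernel, and all function names are hypothetical.

```python
import numpy as np

def loss(alpha, K):
    # Assumed objective: -(1/N) * sum_i log((K alpha)_i^2) + alpha^T K alpha
    Ka = K @ alpha
    return -np.mean(np.log(Ka ** 2)) + alpha @ Ka

def standard_gradient(alpha, K):
    # Gradient w.r.t. alpha; 1.0 / Ka is the coordinatewise inverse (K alpha)^{-1}.
    Ka = K @ alpha
    return 2.0 * (Ka - K @ (1.0 / Ka) / len(alpha))

def natural_gradient(alpha, K):
    # K^{-1} times the standard gradient; no product with K is required.
    Ka = K @ alpha
    return 2.0 * (alpha - (1.0 / Ka) / len(alpha))

def numerical_gradient(alpha, K, eps=1e-6):
    # Central finite differences, used only to check the closed form.
    grad = np.zeros_like(alpha)
    for i in range(len(alpha)):
        step = np.zeros_like(alpha)
        step[i] = eps
        grad[i] = (loss(alpha + step, K) - loss(alpha - step, K)) / (2.0 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-0.5 * sq_dists)            # stand-in Gaussian kernel, NOT the SDO kernel
alpha = rng.uniform(0.5, 1.5, size=8)  # positive start, so K @ alpha > 0

g_std = standard_gradient(alpha, K)
g_nat = natural_gradient(alpha, K)
print(np.max(np.abs(g_std - numerical_gradient(alpha, K))))
print(np.max(np.abs(K @ g_nat - g_std)))
```

Both printed discrepancies should be close to zero: the first checks the closed form against finite differences, the second checks that the two gradients differ only by the factor $K$.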

Figures (7)

  • Figure 1: Anomaly detection results on ADBench; higher is better. SOSREP is second best among 18 algorithms.
  • Figure 2: For each value of the hyperparameter $a$, we compute the Fisher divergence between the density learned on the training set and the density inferred on the test set.
  • Figure 3: Fraction of negative values for natural-gradient versus $\alpha$-gradient optimization across datasets. The x-axis represents datasets from ADBench (see the Supplementary Material for details).
  • Figure 4: (a) Distribution of kernel values and SDO kernel values inside and between clusters. (b) SOSREP and KDE log-likelihoods. The x-axis represents points in the data, arranged by clusters; the y-axis shows the log-likelihood.
  • Figure 5: Anomaly detection results on ADBench, relative ranking per dataset; higher is better. SOSREP is second best among 18 algorithms.
  • ...and 2 more figures

Theorems & Definitions (38)

  • Lemma 1: Gradients
  • Theorem 2
  • Theorem 3
  • Lemma 4
  • proof
  • proof: Proof of Lemma 1
  • Lemma 5
  • proof
  • Proposition 6
  • proof
  • ...and 28 more