Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection

Nikita Zozoulenko, Thomas Cass, Lukas Gonon

TL;DR

The paper extends the Mahalanobis distance to infinite-dimensional spaces by formulating a variance norm based on Cameron–Martin spaces for general measures on separable Banach spaces. It introduces an extended covariance operator to handle non-injective cases and proves that the variance norm is invariant under invertible bounded linear transformations, with a Hilbert-space specialization linking to RKHS and kernel methods. It provides practical computations via empirical measures, derives consistency and distributional results, and develops a kernelized nearest-neighbour Mahalanobis distance with regularization. The authors validate the framework on kernelized multivariate time-series novelty detection across multiple kernels, showing competitive or superior performance to traditional kernelized Mahalanobis distances and demonstrating scalability to high-dimensional data. The work offers a unified, theoretically grounded approach for anomaly detection in infinite-dimensional settings with concrete algorithms and empirical support.

Abstract

The Mahalanobis distance is a classical tool used to measure the covariance-adjusted distance between points in $\mathbb{R}^d$. In this work, we extend the concept of Mahalanobis distance to separable Banach spaces by reinterpreting it as a Cameron-Martin norm associated with a probability measure. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm, which can naturally be estimated using empirical measures of a sample. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings; importantly, it incorporates non-injective covariance operators. We prove that the variance norm is invariant under invertible bounded linear transformations of the data, extending previous results which are limited to unitary operators. In the Hilbert space setting, we connect the variance norm to the RKHS of the covariance operator, and establish consistency and convergence results for estimation using empirical measures with Tikhonov regularization. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance, and study some of its finite-sample concentration properties. In an empirical study on 12 real-world data sets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series novelty detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels.
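
As a finite-dimensional illustration of the invariance property mentioned in the abstract, the following is a minimal sketch of the classical empirical Mahalanobis distance in $\mathbb{R}^d$. It is not the paper's kernelized algorithm. The function name `mahalanobis`, the ridge-style parameter `reg`, and the toy data are assumptions made for this example only.

```python
import numpy as np

def mahalanobis(x, sample, reg=0.0):
    """Covariance-adjusted distance from x to the mean of `sample`.

    `reg` is a hypothetical Tikhonov (ridge) parameter added to the
    diagonal of the empirical covariance; reg=0.0 gives the classical
    distance and requires the empirical covariance to be invertible.
    """
    mean = sample.mean(axis=0)
    centered = sample - mean
    cov = centered.T @ centered / sample.shape[0]  # empirical covariance
    diff = x - mean
    # Solve (cov + reg * I) z = diff instead of forming an explicit inverse.
    z = np.linalg.solve(cov + reg * np.eye(cov.shape[0]), diff)
    return float(np.sqrt(diff @ z))

# Toy data (assumption): 200 correlated points in R^3.
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.1]])
sample = rng.normal(size=(200, 3)) @ mix
x = np.array([1.0, -1.0, 0.5])

# An invertible, non-unitary linear map (determinant is -5).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0],
              [3.0, 0.0, 1.0]])

d_original = mahalanobis(x, sample)
d_transformed = mahalanobis(A @ x, sample @ A.T)  # transform data and query
d_regularized = mahalanobis(x, sample, reg=0.1)
```

Without regularization, `d_original` and `d_transformed` agree up to floating-point error, since the map `A` cancels between the transformed covariance and the transformed difference vector. With `reg > 0` the value shrinks, and in this simple ridge form the exact invariance is generally lost.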

Paper Structure

This paper contains 30 sections, 18 theorems, 97 equations, 2 figures, 3 tables, 3 algorithms.

Key Result

Lemma 2.4

Let $\mu \in \mathcal{M}_V$. The covariance operator $\mathcal{K} : V^* \to V$ is the unique operator satisfying for all $f,g\in V^*$. Moreover, $\mathcal{K}$ is compact, and in particular bounded.

Figures (2)

  • Figure 1: Optimal hyper-parameters for computing the anomaly distance as per the Gram-matrix and conformance algorithms, sampled across all data sets and all kernels, normalized by the number of classes per data set. The results were obtained via a repeated $k$-fold cross-validation on the train set.
  • Figure 2: Pairwise comparison of one-versus-rest test scores for the Mahalanobis distance and the Conformance score. Each point represents one kernel and one data set.

Theorems & Definitions (41)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Lemma 2.4
  • Remark 2.5
  • Definition 2.6
  • Proposition 2.7
  • Definition 2.8
  • Proposition 2.9
  • Theorem 2.10
  • ...and 31 more