Fusing Dictionary Learning and Support Vector Machines for Unsupervised Anomaly Detection

Paul Irofti; Iulian-Andrei Hîji; Andrei Pătraşcu; Nicolae Cleju

TL;DR

This work tackles unsupervised anomaly detection by unifying dictionary learning (DL) with one-class SVM (OC-SVM) into a single composite objective, enabling sparse representations to directly inform the OC-SVM decision boundary. The authors derive K-SVD-type alternating updates for both standard DL and Dictionary Pair Learning (DPL), extend the framework to kernelized versions, and provide convergence results and parametric analysis. Empirical results on diverse outlier datasets show that DL–OCSVM and its kernel variants achieve higher or competitive balanced accuracy compared to OC-SVM, LOF, Isolation Forest, and autoencoders, underscoring the practical value of coupling sparse representations with one-class classification. The proposed approach offers a flexible, theoretically grounded path for kernelized, sparsity-driven anomaly detection with strong performance and extensibility to non-linear settings.
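The comparisons above are reported in balanced accuracy (BA), the mean of the true-positive rate and the true-negative rate. A minimal sketch of that metric follows. The function name and the label convention (1 for anomaly, 0 for normal) are assumptions made for illustration, not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy = (TPR + TNR) / 2.

    Assumed label convention (hypothetical): 1 = anomaly, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    tpr = tp / (tp + fn)  # recall on anomalies
    tnr = tn / (tn + fp)  # recall on normal points
    return 0.5 * (tpr + tnr)
```

Unlike plain accuracy, BA is not inflated by a detector that labels everything as normal on a dataset with few outliers: such a detector has a true-positive rate of 0 and therefore a BA of 0.5.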

Abstract

In this paper we study how sparse representation techniques can improve one-class support vector machines (OC-SVM) for unsupervised anomaly detection. Dictionary Learning (DL) has recently become a common analysis technique for revealing hidden sparse patterns in data, and our approach uses this insight to give unsupervised detection more control over pattern finding and dimensions. We introduce a new anomaly detection model that unifies the OC-SVM and DL residual functions into a single composite objective, which is then solved through K-SVD-type iterative algorithms. A closed form of the alternating K-SVD iteration is explicitly derived for the new composite model, and practical, implementable schemes are discussed. The standard DL model is adapted to the Dictionary Pair Learning (DPL) context, where the usual sparsity constraints are naturally eliminated. Finally, we extend both objectives to the more general setting that allows the use of kernel functions. We report the empirical convergence properties of the resulting algorithms, analyze their parametrization in depth, and demonstrate their numerical performance in comparison with existing methods.
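To make the notion of a DL residual concrete, here is a simplified sketch that scores each signal by its reconstruction error under a fixed dictionary. It is not the composite DL and OC-SVM objective of the paper: the dictionary `D` is assumed to be given, sparse codes are computed with a plain Orthogonal Matching Pursuit, and the names `omp` and `residual_scores` are hypothetical.

```python
import numpy as np

def omp(D, y, s):
    """Greedy Orthogonal Matching Pursuit: code y with at most s atoms of D."""
    residual = y.astype(float)
    support = []
    coef = np.zeros(0)
    for _ in range(s):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def residual_scores(D, Y, s):
    """Reconstruction error ||y - D x|| for every column y of Y."""
    return np.array([np.linalg.norm(y - D @ omp(D, y, s)) for y in Y.T])

# Illustrative data (assumption): identity dictionary, sparsity level 2.
D = np.eye(4)
Y = np.array([[1.0, 0.0, 2.0, 0.0],   # 2-sparse signal, fits the model
              [1.0, 1.0, 1.0, 1.0]]).T  # dense signal, does not fit
scores = residual_scores(D, Y, s=2)
```

Signals that the dictionary represents well with few atoms get a small score, while signals outside the sparse model get a larger one. In the paper this residual is one term of a joint objective rather than a standalone score.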

Paper Structure

This paper contains 18 sections, 5 theorems, 37 equations, 4 figures, 6 tables, 9 algorithms.

Introduction
Contributions.
Outline
Notations
Preliminaries
Uniform support for unsupervised anomaly detection
Supervised pair learning with vector machines
Related work
Anomaly Detection Methodology and Algorithms
Uniform representations through regularization and OC-SVM
Standard DL with $\ell_{2,1}$ regularization
Adapting the DPL formulation
Kernel formulation
Kernel Dictionary Learning
Kernel DPL
...and 3 more sections

Key Result

Theorem 1

Let $\phi(z) = z$ and let $\nu$ be the element of $\omega$ corresponding to the current atom $d_i$. Then the closed-form solution of the K-SVD iteration is given as follows. Let $d^* = \arg\min\limits_{\lVert v\rVert = 1} \; -\frac{1}{2}\lVert R^T v + \nu\lambda\rVert^2_2$. (ii) Otherwise, when $\lVert R^T d^* + \nu\lambda\rVert < \beta$, $(d_i^+, (x^i)^+) = (\ldots$

Figures (4)

Figure 1: (Left) Convergence on shuttle dataset: each point is an inner iteration, vertical lines separate outer iterations. (Right) Outer iterations convergence analysis of total error $\mathcal{L}$ (first), DL error $F$ (second), OC-SVM error $G$ (third) and BA variation (fourth).
Figure 2: Mean (left) and maximum (right) BA for different values of $\beta$ on multiple datasets.
Figure 3: Grid-search for $\beta$ and $\gamma$ on lympho dataset with the resulting BA mean (left) and maximum (right).
Figure 4: BA obtained in the first scenario for different numbers of outliers used for training and prediction.

Theorems & Definitions (12)

Theorem 1
proof
Remark 2
Proposition 3
proof
Remark 4
Theorem 5
proof
Corollary 6
proof
...and 2 more
