Fusing Dictionary Learning and Support Vector Machines for Unsupervised Anomaly Detection
Paul Irofti, Iulian-Andrei Hîji, Andrei Pătraşcu, Nicolae Cleju
TL;DR
This work tackles unsupervised anomaly detection by unifying dictionary learning (DL) with one-class SVM (OC-SVM) into a single composite objective, enabling sparse representations to directly inform the OC-SVM decision boundary. The authors derive K-SVD-type alternating updates for both standard DL and Dictionary Pair Learning (DPL), extend the framework to kernelized versions, and report empirical convergence results together with a parametric analysis. Empirical results on diverse outlier datasets show that DL–OCSVM and its kernel variants achieve higher or competitive balanced accuracy compared to OC-SVM, LOF, Isolation Forest, and autoencoders, underscoring the practical value of coupling sparse representations with one-class classification. The proposed approach offers a flexible, theoretically grounded path for kernelized, sparsity-driven anomaly detection with strong performance and extensibility to non-linear settings.
Abstract
In this paper we study how sparse representation techniques can improve one-class support vector machines (OC-SVM) for unsupervised anomaly detection. Since Dictionary Learning (DL) has recently become a common analysis technique that reveals hidden sparse patterns in data, our approach uses this insight to give unsupervised detection more control over pattern finding and dimensions. We introduce a new anomaly detection model that unifies the OC-SVM and DL residual functions into a single composite objective, which is then solved through K-SVD-type iterative algorithms. A closed form of the alternating K-SVD iteration is explicitly derived for the new composite model, and practically implementable schemes are discussed. The standard DL model is adapted to the Dictionary Pair Learning (DPL) context, where the usual sparsity constraints are naturally eliminated. Finally, we extend both objectives to the more general setting that allows the use of kernel functions. We report the empirical convergence properties of the resulting algorithms, analyze their parametrization in depth, and demonstrate their numerical performance in comparison with existing methods.
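To make the phrase "single composite objective" concrete, the block below recalls the two standard building blocks in their textbook forms, followed by one schematic way of coupling them. The notation (signals Y, dictionary D, sparse codes X, sparsity level s, OC-SVM parameters w, rho, xi, nu) and the trade-off weight lambda are assumptions made here for illustration. The abstract does not state the exact weighting or constraint set the authors use, so the third expression should be read as a sketch, not as their formulation.

```latex
% Standard dictionary learning: Y in R^{m x N}, D in R^{m x n}, X in R^{n x N}
\min_{D,\,X} \; \|Y - DX\|_F^2
\quad \text{s.t.} \quad \|x_i\|_0 \le s \;\; (i = 1,\dots,N),
\qquad \|d_j\|_2 = 1 \;\; (j = 1,\dots,n)

% Standard one-class SVM on feature vectors z_i, with nu in (0, 1]
\min_{w,\,\rho,\,\xi} \; \tfrac{1}{2}\|w\|_2^2 - \rho + \frac{1}{\nu N}\sum_{i=1}^{N}\xi_i
\quad \text{s.t.} \quad w^\top z_i \ge \rho - \xi_i, \qquad \xi_i \ge 0

% Schematic coupling (assumption): the OC-SVM acts on the sparse codes, z_i = x_i,
% and lambda > 0 is a hypothetical weight balancing the two terms
\min_{D,\,X,\,w,\,\rho,\,\xi} \; \|Y - DX\|_F^2
+ \lambda\Big(\tfrac{1}{2}\|w\|_2^2 - \rho + \frac{1}{\nu N}\sum_{i=1}^{N}\xi_i\Big)
\quad \text{s.t.} \quad w^\top x_i \ge \rho - \xi_i, \quad \xi_i \ge 0,
\quad \|x_i\|_0 \le s, \quad \|d_j\|_2 = 1
```

Under a coupling of this kind, the codes X appear both in the representation residual and in the OC-SVM constraints, which is why an alternating, K-SVD-type scheme that updates the dictionary, the codes, and the classifier in turn is a natural fit.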
