Boosting Anomaly Detection Using Unsupervised Diverse Test-Time Augmentation
Seffi Cohen, Niv Goldshlager, Lior Rokach, Bracha Shapira
TL;DR
This work addresses anomaly detection in tabular data without labeled anomalies by introducing TTAD, a test-time augmentation framework. TTAD combines a neighbor-based data selector, which uses a distance metric learned by a Siamese network, with two augmentation producers (k-Means centroids and SMOTE) that generate diverse, in-distribution augmentations of each test instance. A detector scores the augmented samples and the scores are aggregated. The approach yields consistent AUC improvements over baselines across eight ODDS datasets; the learned metric and k-Means augmentation often provide the strongest gains, while Gaussian-noise TTA can be detrimental. Practically, TTAD is an unsupervised, test-time enhancement for tabular anomaly detection: it needs no labeled anomalies and leaves the underlying detector unchanged, although the Siamese distance metric itself must be learned.
Abstract
Anomaly detection is the well-known task of identifying abnormal events that occur relatively infrequently. Methods for improving anomaly detection performance have been widely studied. However, no studies have applied test-time augmentation (TTA) to anomaly detection in tabular data. TTA aggregates the predictions for several synthetic versions of a given test sample; these versions offer different points of view on the test instance and may reduce its prediction bias. We propose Test-Time Augmentation for anomaly Detection (TTAD), a TTA-based method aimed at improving anomaly detection performance. TTAD augments a test instance based on its nearest neighbors; several methods, including k-Means centroids and SMOTE, are used to produce the augmentations. To retrieve a test instance's neighbors, our technique uses a Siamese network to learn a distance metric. Our experiments show that an anomaly detector using our TTA technique achieved significantly higher AUC results on all datasets evaluated.
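The pipeline described in the abstract (retrieve neighbors, produce augmentations, score them, aggregate) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: plain Euclidean k-NN stands in for the Siamese-learned metric, scikit-learn's IsolationForest is an arbitrary choice of detector, the scores are aggregated with a simple mean, and the function name tta_scores, the neighbor pool X_pool, and the default parameter values are hypothetical. Only the k-Means centroid producer is shown.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors


def tta_scores(detector, X_test, X_pool, n_neighbors=10, n_augmentations=3,
               random_state=0):
    """Return one aggregated anomaly score per test row (higher = more anomalous).

    X_pool is the set from which neighbors are drawn (an assumption of this
    sketch; it may be the test set itself).
    """
    # Neighbor selection: Euclidean k-NN (assumption; the paper retrieves
    # neighbors with a distance metric learned by a Siamese network).
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(X_pool)
    _, neighbor_idx = nn.kneighbors(X_test)

    scores = np.empty(len(X_test))
    for i, idx in enumerate(neighbor_idx):
        # Augmentation producer: k-Means centroids of the neighborhood.
        km = KMeans(n_clusters=n_augmentations, n_init=10,
                    random_state=random_state)
        centroids = km.fit(X_pool[idx]).cluster_centers_
        # Score the original instance together with its augmentations.
        batch = np.vstack([X_test[i:i + 1], centroids])
        # IsolationForest.score_samples is lower for anomalies, so negate it.
        # Aggregation: simple mean (assumption).
        scores[i] = np.mean(-detector.score_samples(batch))
    return scores
```

A detector fitted beforehand, for example IsolationForest(random_state=0).fit(X_train), can be passed in as-is; the sketch only changes how test instances are scored, not how the detector is trained.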
