Learning to Detect Interesting Anomalies

Alireza Vafaei Sadr; Bruce A. Bassett; Emmanuel Sekyi

TL;DR

AHUNT tackles the problem of detecting interesting anomalies of kinds that have never been seen before by learning a dynamic feature space through active learning. By iteratively labeling strategically chosen examples and retraining a CNN, AHUNT evolves the latent representation and grows an anomaly taxonomy that includes a reserve class, enabling personalized rankings of anomaly classes. On MNIST, CIFAR-10, and DESI, it outperforms both active learning on static feature spaces and traditional anomaly detectors, and it adapts to changing user interests. The approach offers a scalable, user-guided path to discovering meaningful anomalies in large, diverse datasets such as astronomical surveys.

Abstract

Anomaly detection algorithms are typically applied to static, unchanging data features hand-crafted by the user. But how does a user systematically craft good features for anomalies that have never been seen? Here we couple deep learning with active learning -- in which an Oracle iteratively labels small amounts of data selected algorithmically over a series of rounds -- to automatically and dynamically improve the data features for efficient outlier detection. This approach, AHUNT, shows excellent performance on MNIST, CIFAR10, and Galaxy-DESI data, significantly outperforming both standard anomaly detection and active learning algorithms with static feature spaces. Beyond improved performance, AHUNT also allows the number of anomaly classes to grow organically in response to the Oracle's evaluations. Extensive ablation studies explore the impact of the Oracle question selection strategy and the loss function on performance. We illustrate how the dynamic anomaly class taxonomy represents another step towards fully personalized rankings of different anomaly classes that reflect a user's interests, allowing the algorithm to learn to ignore statistically significant but uninteresting outliers (e.g., noise). This should prove useful in the era of massive astronomical datasets serving diverse sets of users who can only review a tiny subset of the incoming data.
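
To make the idea of algorithmically selecting data for the Oracle concrete, the following is a minimal sketch of the three question selection strategies named in the Figure 5 caption (most uncertain, most anomalous, random). It is an illustration under stated assumptions, not the authors' implementation: the function name `select_questions`, the use of a single per-sample anomaly probability, and the choice of 0.5 as the point of maximum uncertainty are all assumptions made here.

```python
import numpy as np


def select_questions(p_anomaly, n_questions, strategy="uncertain", rng=None):
    """Return indices of the samples to send to the Oracle.

    p_anomaly is a 1-D array of predicted anomaly probabilities. This is a
    hypothetical model output; the scoring used in the paper may differ.
    """
    p = np.asarray(p_anomaly, dtype=float)
    n = min(n_questions, p.size)
    if strategy == "uncertain":
        # Most uncertain: probability closest to 0.5 (assumed threshold).
        order = np.argsort(np.abs(p - 0.5), kind="stable")
    elif strategy == "anomalous":
        # Most anomalous: highest predicted anomaly probability first.
        order = np.argsort(-p, kind="stable")
    elif strategy == "random":
        # Baseline: ignore the scores and pick samples uniformly at random.
        rng = np.random.default_rng() if rng is None else rng
        order = rng.permutation(p.size)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return order[:n]


# Toy scores for five samples in one round (made-up values).
scores = np.array([0.05, 0.48, 0.92, 0.60, 0.10])
uncertain_idx = select_questions(scores, 2, "uncertain")
anomalous_idx = select_questions(scores, 2, "anomalous")
random_idx = select_questions(scores, 2, "random", rng=np.random.default_rng(0))
```

In a full active learning round, the selected samples would be labeled by the Oracle and the network retrained on the enlarged labeled set, so that the latent features change from round to round; that retraining step is omitted here.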

Paper Structure (12 sections, 2 equations, 7 figures, 3 tables)

Introduction
Overview of AHUNT
Comparison Algorithms and Ablation Tests
Datasets and Metrics
Evaluation Metrics
Results
Dynamic vs Static Features
Active Learning vs Random Question Selection
AHUNT vs Anomaly Detection Algorithms
Growing Class Taxonomy & Changing Interests
Conclusions and Future Work
Choice of Loss Function

Figures (7)

Figure 1: Illustration of the AHUNT dynamic latent (feature) space evolution for MNIST over 30 rounds of Active Learning using UMAP dimensionality reductions of the latent space down to two dimensions. In round 1 (left) the anomalies (dark blue) are degenerate with the normal classes (beige and light blue). By round 15 a large separate cluster of anomalies has formed and by round 30 the anomalies are completely separated from the beige cluster. Note that this comes at the cost of dispersing the clusters of normal data, but that is acceptable since our goal is not to classify the normal data.
Figure 2: Random samples from the three datasets we explore: MNIST (top row), CIFAR-10 (middle row) and DESI (bottom row). The first two columns show examples from the two normal classes we chose while the last column shows an example from the anomaly class. For more information see Table \ref{table_data}.
Figure 3: The performance of AHUNT on the three datasets MNIST, CIFAR10 and DESI in comparison to active learning with static latent space features (Static AL). In all cases AHUNT significantly outperforms the static version of the algorithm in which there is no evolution of the latent space from one round of active learning to the next. In the static case only the weights of the final layer connecting the output probabilities to the (static) latent space are updated. The first row shows the MCC score while the second row of figures shows the fraction of new anomalies correctly identified as anomalies. For MNIST AHUNT rapidly learns to classify all anomalies while for CIFAR10 and DESI the improvement is slower though significantly better than the static case. The CIFAR10 results are dominated by the rarity of the anomalies in this case (only two per round); see Table \ref{table_data}. Error contours are 68% regions computed from 50 randomised runs at each round. To compare the performance of AHUNT against all algorithms see Fig. \ref{fig:all_three_datasets_ALL_Algs}.
Figure 4: This figure shows how the algorithm can adapt to changing priorities as well as a changing class taxonomy. The Oracle initially has maximum interest in the reserve class until the first anomaly is found. Then interest is split (0.83, 0.17) between the reserve class and the newly discovered (1st) anomaly class. As soon as the second anomaly class is discovered the user splits their interest (0.125, 0.75, 0.125) over the reserve, 1st anomaly class, 2nd anomaly class and reserve class respectively. The top panel shows the total MCC score while the bottom panel shows the number of allowed questions for each class, where the total number of questions is 10. We see that the MCC for the 2nd anomaly class grows very rapidly, consistent with the high user interest in that class.
Figure 5: Effect of the three different Oracle question selection strategies on the performance of AHUNT on the three datasets MNIST, CIFAR10 and DESI. The three question strategies considered are (1) sending the most uncertain data to the Oracle, (2) sending the most anomalous data to the Oracle and (3) sending random data to the Oracle. We see that none of the strategies is optimal in all cases and for all rounds. Of the three, random performed consistently worse than both the Uncertainty and Anomalous strategies, showing that active learning provides a significant performance boost, especially when anomalies are rare relative to the normal classes each round. In the case of DESI, anomalies were a much larger percentage ($\sim 13\%$ of each round of data) than in the other cases, narrowing the advantage provided by the active learning.
...and 2 more figures
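
The Figure 4 caption describes a fixed budget of 10 Oracle questions per round being divided among classes according to the user's interest weights. One plausible way to turn fractional interest into whole question counts is sketched below. The function name `allocate_questions` and the largest-remainder rounding rule are assumptions made for this illustration; the caption does not say how the authors round.

```python
import math


def allocate_questions(interest, total_questions=10):
    """Split an integer question budget in proportion to interest weights.

    Largest-remainder rounding is an assumption of this sketch, not a rule
    taken from the paper.
    """
    weight_sum = sum(interest)
    if weight_sum <= 0:
        raise ValueError("interest weights must sum to a positive value")
    # Ideal (fractional) share of the budget for each class.
    shares = [total_questions * w / weight_sum for w in interest]
    counts = [math.floor(s) for s in shares]
    leftover = total_questions - sum(counts)
    # Give the remaining questions to the classes with the largest
    # fractional parts, so the counts always sum to the full budget.
    by_remainder = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Weight tuples quoted in the Figure 4 caption, used positionally.
print(allocate_questions((0.83, 0.17)))          # [8, 2]
print(allocate_questions((0.125, 0.75, 0.125)))  # [1, 8, 1]
```

Whatever rounding rule is used, the counts should sum to the full budget each round, which is the property this sketch preserves.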
