Feature Explosion: a generic optimization strategy for outlier detection algorithms

Qi Li

TL;DR

The paper tackles the problem of excessive specialization and redundancy among outlier detection algorithms by introducing OSD, a generic physics-inspired optimization that can be plugged into diverse detectors. OSD operates by partitioning data into object-blocks via a pruned $k$-NN graph, detonating a virtual bomb to push low-mass blocks (potential outliers) away from high-mass blocks (normals), and applying a repulsion stage to prevent mixing, yielding a transformed dataset $\widehat{X}$ that improves detection accuracy across principles. Empirically, OSD yields average improvements of $15\%$ in AUC and $63.7\%$ in AP across 14 detectors on 24 datasets, and ablation studies confirm the necessity of the explosion center and repulsion components. The work demonstrates that separating the optimization from the detection principle enables broad applicability and practical gains, reducing the need to design bespoke optimized variants for each algorithm. It also provides robustness analyses across inflection-point thresholds and density-imbalanced data, suggesting substantial real-world impact for enhancing outlier detection pipelines.
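The summary above describes three stages: block partitioning, explosion, and repulsion. The following is a minimal pure-Python sketch of the first two only, and it is not the paper's implementation. Every concrete choice is an assumption made for illustration: pruning k-NN edges longer than `prune_factor` times the mean edge length, taking a block's mass to be its point count, placing the explosion center at the centroid of the heaviest block, and using a push of `strength / mass`. The function names `knn_edges`, `object_blocks`, and `explode` are hypothetical, and the repulsion stage is omitted.

```python
import math
from collections import defaultdict


def knn_edges(X, k):
    """Return (distance, i, j) for each point's k nearest neighbours (brute force)."""
    edges = []
    for i, p in enumerate(X):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(X) if j != i)
        edges.extend((d, i, j) for d, j in dists[:k])
    return edges


def object_blocks(X, k=2, prune_factor=2.0):
    """Group points into blocks: connected components of a pruned k-NN graph.

    Assumption: an edge is pruned when it is longer than
    prune_factor * (mean k-NN edge length). The paper's rule may differ.
    """
    edges = knn_edges(X, k)
    mean_len = sum(d for d, _, _ in edges) / len(edges)
    parent = list(range(len(X)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for d, i, j in edges:
        if d <= prune_factor * mean_len:
            parent[find(i)] = find(j)

    blocks = defaultdict(list)
    for i in range(len(X)):
        blocks[find(i)].append(i)
    return list(blocks.values())


def explode(X, k=2, prune_factor=2.0, strength=1.0):
    """Push low-mass blocks away from the heaviest block's centroid.

    Assumptions: mass = number of points in a block; the explosion center is
    the centroid of the heaviest block; displacement = strength / mass.
    """
    blocks = object_blocks(X, k, prune_factor)
    heaviest = max(blocks, key=len)
    dim = len(X[0])
    center = [sum(X[i][d] for i in heaviest) / len(heaviest) for d in range(dim)]

    X_hat = [list(p) for p in X]
    for block in blocks:
        if block is heaviest:
            continue  # the heaviest block is left in place
        push = strength / len(block)
        for i in block:
            r = math.dist(X[i], center)
            if r == 0:
                continue  # a point at the center has no direction to move in
            X_hat[i] = [x + push * (x - c) / r for x, c in zip(X[i], center)]
    return X_hat


# Toy data: a tight cluster of five points plus one distant point.
X = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05), (3, 3)]
X_hat = explode(X)
```

In this toy case the distant point forms a single-point block and is moved one unit further from the cluster centroid, while the cluster itself is unchanged. A detector would then be run on `X_hat` instead of `X`, which is the sense in which the transformation acts as a plugin.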

Abstract

Outlier detection tasks aim at discovering potential issues or opportunities and are widely used in cybersecurity, financial security, industrial inspection, etc. To date, thousands of outlier detection algorithms have been proposed. Clearly, real-world scenarios do not need so many algorithms; in other words, a large number of outlier detection algorithms are redundant. We believe the root cause of this redundancy lies in the current highly customized (i.e., non-generic) optimization strategies. Specifically, when researchers seek to improve the performance of existing outlier detection algorithms, they have to design separate optimized versions tailored to the principles of each algorithm, leading to an ever-growing number of outlier detection algorithms. To address this issue, in this paper, we introduce the concept of explosion from physics into the outlier detection task and propose a generic optimization strategy based on feature explosion, called OSD (Optimization Strategy for outlier Detection algorithms). In the future, when improving the performance of existing outlier detection algorithms, it will be sufficient to invoke the OSD plugin without the need to design customized optimized versions for them. We compared the performance of 14 outlier detection algorithms on 24 datasets before and after invoking the OSD plugin. The experimental results show that the performance of all outlier detection algorithms improves on almost all datasets. In terms of average accuracy, OSD improves these outlier detection algorithms by 15% (AUC) and 63.7% (AP).

Paper Structure

This paper contains 18 sections, 8 equations, 9 figures, 5 tables, 3 algorithms.

Figures (9)

  • Figure 1: Feature explosion.
  • Figure 2: An example about outliers and normal objects.
  • Figure 3: The object-block division.
  • Figure 4: An example of virtual bomb and explosion shock force.
  • Figure 5: Geometric proof.
  • ...and 4 more figures

Theorems & Definitions (6)

  • proof
  • proof
  • proof
  • proof
  • proof
  • proof