Boundary Peeling: Outlier Detection Method Using One-Class Peeling

Sheikh Arafat, Na Sun, Maria L. Weese, Waldyn G. Martinez

TL;DR

In synthetic data simulations, One-Class Boundary Peeling outperforms all state-of-the-art methods when no outliers are present and, relative to benchmark methods, maintains comparable or superior performance when outliers are present.

Abstract

Unsupervised outlier detection is a crucial phase of data analysis and remains an active area of research. A good outlier detection algorithm should be computationally efficient, be robust to tuning parameter selection, and perform consistently well across diverse underlying data distributions. We introduce One-Class Boundary Peeling, an unsupervised outlier detection algorithm. One-Class Boundary Peeling uses the average signed distance from iteratively peeled, flexible boundaries generated by one-class support vector machines. One-Class Boundary Peeling has robust hyperparameter settings and, for increased flexibility, can be cast as an ensemble method. In synthetic data simulations, One-Class Boundary Peeling outperforms all state-of-the-art methods when no outliers are present and, relative to benchmark methods, maintains comparable or superior performance when outliers are present. One-Class Boundary Peeling performs competitively in terms of correct classification, AUC, and processing time on common benchmark data sets.
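The core idea in the abstract, averaging signed distances over iteratively peeled one-class SVM boundaries, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a "peel" removes the current support vectors, that an RBF kernel is used, and that peeling stops once few points remain. The function name `boundary_peeling_scores` and the parameters `nu`, `gamma`, `min_points`, and `max_peels` are hypothetical choices made for illustration, built on scikit-learn's `OneClassSVM`.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def boundary_peeling_scores(X, nu=0.01, gamma="scale", min_points=10, max_peels=200):
    """Average signed distance to successively peeled one-class SVM boundaries.

    Illustrative sketch only: the peeling rule (drop support vectors), the
    stopping rule, and all default values are assumptions, not the paper's.
    Lower scores suggest a more outlying observation.
    """
    X = np.asarray(X, dtype=float)
    remaining = np.arange(len(X))      # indices of points not yet peeled
    distance_sum = np.zeros(len(X))    # running sum of signed distances
    n_peels = 0

    while len(remaining) > min_points and n_peels < max_peels:
        # Fit a flexible boundary around the points that are still present.
        model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X[remaining])

        # Signed distance of EVERY original point to the current boundary
        # (positive inside the boundary, negative outside).
        distance_sum += model.decision_function(X)
        n_peels += 1

        # Peel: remove the support vectors that define the current boundary.
        # model.support_ indexes into X[remaining], so delete by position.
        remaining = np.delete(remaining, model.support_)

    return distance_sum / max(n_peels, 1)
```

Under these assumptions, observations that sit outside most of the peeled boundaries accumulate negative distances and end up with the lowest average scores. Turning the scores into outlier flags would require a threshold, which this sketch leaves unspecified.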

Paper Structure

This paper contains 6 sections, 2 equations, 2 figures, 8 tables, and 2 algorithms.

Figures (2)

  • Figure 1: Boundary Peeling (BP) example on a two-dimensional unimodal data set. The data comprise 100 inlier observations generated from $t(df = 5)$ and 10 outlier observations generated from $U(-10,10)$. Contours indicate kernel signed distances from the separating hyperplane. Blue contours indicate positive distances (darker blue indicates a distance closer to zero), while red contours indicate negative distances. Support vectors are marked with an X.
  • Figure 2: Boundary Peeling (BP) example on a two-dimensional bimodal data set. The data comprise 50 inlier observations generated from $N(-3,1)$, 50 inlier observations generated from $N(3,1)$, and 20 outlier observations generated from $U(-10,10)$. Contours indicate kernel signed distances from the separating hyperplane. Blue contours indicate positive distances (darker blue indicates a distance closer to zero), while red contours indicate negative distances. Support vectors are marked with an X.
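
The two example data sets described in the figure captions can be approximated with a short sketch. The captions do not say how the two coordinates are generated, so this sketch assumes each coordinate is drawn independently from the stated distribution and that the $N(\pm 3, 1)$ means apply to both coordinates. The function names `make_unimodal_example` and `make_bimodal_example` are hypothetical.

```python
import numpy as np


def make_unimodal_example(rng):
    """100 inliers from t(df=5) and 10 outliers from U(-10, 10), in 2 dimensions.

    Assumption: coordinates are drawn independently of each other.
    """
    inliers = rng.standard_t(df=5, size=(100, 2))
    outliers = rng.uniform(-10, 10, size=(10, 2))
    X = np.vstack([inliers, outliers])
    is_outlier = np.r_[np.zeros(100, dtype=bool), np.ones(10, dtype=bool)]
    return X, is_outlier


def make_bimodal_example(rng):
    """50 inliers from N(-3, 1), 50 from N(3, 1), and 20 outliers from U(-10, 10).

    Assumption: each mode is centered at (-3, -3) or (3, 3) with unit scale.
    """
    mode_a = rng.normal(loc=-3.0, scale=1.0, size=(50, 2))
    mode_b = rng.normal(loc=3.0, scale=1.0, size=(50, 2))
    outliers = rng.uniform(-10, 10, size=(20, 2))
    X = np.vstack([mode_a, mode_b, outliers])
    is_outlier = np.r_[np.zeros(100, dtype=bool), np.ones(20, dtype=bool)]
    return X, is_outlier
```

Passing a seeded generator such as `np.random.default_rng(0)` makes the draws reproducible. Because the uniform outliers cover the same range as the inliers, some of them may land inside the inlier cloud.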