Rethinking Unsupervised Outlier Detection via Multiple Thresholding

Zhonghang Liu, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin

TL;DR

A multiple thresholding (Multi-T) module is proposed that generates two thresholds to isolate inliers and outliers from the unlabelled target dataset: the outliers are used to obtain a better feature representation, while the inliers provide an uncontaminated manifold.

Abstract

In unsupervised image outlier detection, assigning outlier scores holds greater significance than its subsequent task: thresholding to predict labels. This is because determining the optimal threshold on non-separable outlier score functions is an ill-posed problem. However, the lack of predicted labels not only hinders some real applications of current outlier detectors but also prevents these methods from being enhanced by the dataset's self-supervision. To advance existing scoring methods, we propose a multiple thresholding (Multi-T) module. It generates two thresholds that isolate inliers and outliers from the unlabelled target dataset: the outliers are used to obtain a better feature representation, while the inliers provide an uncontaminated manifold. Extensive experiments verify that Multi-T can significantly improve previously proposed outlier scoring methods. Moreover, Multi-T enables a naive distance-based method to reach state-of-the-art performance.
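
The two-threshold idea can be illustrated with a minimal sketch. It only shows how two thresholds split a set of outlier scores into confident inliers, confident outliers, and an undecided middle band. The function name, the example scores, and the threshold values t_in and t_out are all hypothetical. The abstract does not say how Multi-T chooses its thresholds, so they are fixed by hand here.

```python
from typing import List, Sequence, Tuple


def split_by_two_thresholds(
    scores: Sequence[float], t_in: float, t_out: float
) -> Tuple[List[int], List[int], List[int]]:
    """Partition sample indices by outlier score (higher = more anomalous).

    Assumption: t_in and t_out are supplied by the caller; this sketch
    does not reproduce how the Multi-T module learns them.
    """
    if not t_in < t_out:
        raise ValueError("t_in must be strictly smaller than t_out")
    # Scores at or below the lower threshold are treated as confident inliers.
    inliers = [i for i, s in enumerate(scores) if s <= t_in]
    # Scores at or above the upper threshold are treated as confident outliers.
    outliers = [i for i, s in enumerate(scores) if s >= t_out]
    # Everything strictly in between is left unlabelled.
    uncertain = [i for i, s in enumerate(scores) if t_in < s < t_out]
    return inliers, outliers, uncertain


# Hypothetical scores and hand-picked thresholds, for illustration only.
example_scores = [0.05, 0.10, 0.12, 0.40, 0.55, 0.90, 0.95]
inliers, outliers, uncertain = split_by_two_thresholds(example_scores, 0.2, 0.8)
```

In this reading, the confident inliers would serve as the uncontaminated set and the confident outliers as the self-supervision signal, while the samples in the middle band are not committed to either label.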

Paper Structure (17 sections, 23 equations, 8 figures, 3 tables, 1 algorithm)

Figures (8)

  • Figure 1: Unlike the default UOD paradigm (a), which is concerned with learning a discriminative score function, our perspective (b) is to explore the dataset's prior knowledge via thresholding in order to advance previously proposed methods. (c) illustrates the significant improvement of DeepSVDD (Ruff et al., 2018) with the Multi-T module.
  • Figure 2: The overall paradigm of adopting the multiple-threshold learning (Multi-T) module to advance existing outlier scoring methods. (a) The preparation consists of feature extraction and an initial outlier score function. (b) Visualization of our Multi-T module. (c) Integrating the predicted inliers and outliers with previously proposed outlier detectors to obtain an enhanced outlier score function.
  • Figure 3: Qualitative results compared with SOTA threshold learners. The experiments are conducted on STL-10 (Inlier class: Monkey).
  • Figure 4: Efficiency comparison for outlier scoring. Timing is measured with 10,000 samples, GPU: NVIDIA RTX 3080.
  • Figure 5: Efficiency comparison for thresholding. All thresholding methods are conducted on CPU.
  • ...and 3 more figures