Improving Interpretability of Scores in Anomaly Detection Based on Gaussian-Bernoulli Restricted Boltzmann Machine

Kaiji Sekimoto, Muneki Yasuda

TL;DR

This study proposes a measure that improves the interpretability of the score based on its cumulative distribution, establishes a guideline for setting the classification threshold using this measure, and proposes a method for evaluating the minimum score based on simulated annealing, a technique widely used for optimization problems.

Abstract

Gaussian-Bernoulli restricted Boltzmann machines (GBRBMs) are often used for semi-supervised anomaly detection, where they are trained using only normal data points. In GBRBM-based anomaly detection, normal and anomalous data are classified based on a score that is identical to an energy function of the marginal GBRBM. However, the classification threshold is difficult to set to an appropriate value because the raw score has no direct interpretation. In this study, we propose a measure that improves the interpretability of the score based on its cumulative distribution, and establish a guideline for setting the threshold using this interpretable measure. The results of numerical experiments show that the guideline is reasonable when the threshold is set solely using normal data points. Moreover, because identifying the measure requires a computationally infeasible evaluation of the minimum score value, we also propose an evaluation method for the minimum score based on simulated annealing, which is widely used for optimization problems. The proposed evaluation method was also validated using numerical experiments.
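
To illustrate why a cumulative-distribution view makes a threshold easier to interpret, the following minimal sketch maps raw scores to the fraction of normal-data scores at or below them. This is an empirical stand-in, not the measure proposed in the paper, which is defined through the model and the minimum score. The helper names `empirical_cdf` and `threshold_at` and the synthetic scores are assumptions made for this example.

```python
import numpy as np

def empirical_cdf(normal_scores, query_scores):
    """Fraction of normal-data scores that are <= each query score."""
    sorted_scores = np.sort(np.asarray(normal_scores, dtype=float))
    queries = np.asarray(query_scores, dtype=float)
    ranks = np.searchsorted(sorted_scores, queries, side="right")
    return ranks / sorted_scores.size

def threshold_at(normal_scores, coverage):
    """Smallest observed score whose empirical CDF reaches `coverage`."""
    sorted_scores = np.sort(np.asarray(normal_scores, dtype=float))
    index = int(np.ceil(coverage * sorted_scores.size)) - 1
    return sorted_scores[max(index, 0)]

# Synthetic stand-in for scores of normal data (not results from the paper).
rng = np.random.default_rng(0)
normal_scores = rng.normal(loc=-50.0, scale=3.0, size=1000)

# A coverage of 0.95 means roughly 5% of normal points exceed the threshold.
tau = threshold_at(normal_scores, coverage=0.95)
flagged = normal_scores > tau
print("threshold:", tau, "fraction flagged:", flagged.mean())
```

Unlike a raw score value, the coverage level reads directly as "the fraction of normal data expected to fall below the threshold", which is the kind of interpretability the abstract asks for.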

Paper Structure

This paper contains 12 sections, 23 equations, 7 figures, 1 table, and 1 algorithm.

Figures (7)

  • Figure 1: Illustration of the extended marginal GBRBM in Eq. \ref{eq:extended_GBRBM} for various values of $\beta$. $\beta$ controls the magnitude of the density ratio among data points; when (a) $\beta\rightarrow 0$, (b) $\beta=1$, and (c) $\beta\rightarrow\infty$, the extended marginal GBRBM becomes a uniform distribution, the marginal GBRBM in Eq. \ref{eq:GBRBM_visible}, and the delta function $\delta(f_\theta(\bm{v}) - \mathrm{f}^*)$, respectively.
  • Figure 2: Illustration of (a) the original GBRBM in Eq. \ref{eq:GBRBM} and (b) the replicated GBRBM in Eq. \ref{eq:replicated_GBRBM}.
  • Figure 3: Random samples from training data of the toy dataset. Black and white pixels indicate values of $-1$ and $+1$, respectively.
  • Figure 4: Estimators of $\mathrm{f}^*$ in Eq. \ref{eq:Minimum_FE_score} obtained from the previous and proposed methods. The values are the average and standard deviation obtained over 100 experiments.
  • Figure 5: Random samples from training data obtained from MNIST and F-MNIST.
  • ...and 2 more figures
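
The three regimes described in the Figure 1 caption can be reproduced on a toy example. The sketch below assumes the extended distribution has the form $p_\beta(\bm{v}) \propto \exp(-\beta f_\theta(\bm{v}))$; this form is inferred from the caption rather than taken from the equation itself, and the four discrete scores and the function name `tempered_distribution` are made up for illustration.

```python
import numpy as np

def tempered_distribution(scores, beta):
    """Normalized weights proportional to exp(-beta * score)."""
    scores = np.asarray(scores, dtype=float)
    # Subtract the minimum score so the largest exponent is 0 (numerical stability).
    logits = -beta * (scores - scores.min())
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical scores of four candidate points; index 1 has the minimum score.
toy_scores = [3.0, 1.0, 2.0, 5.0]

for beta in (0.0, 1.0, 50.0):
    print(beta, np.round(tempered_distribution(toy_scores, beta), 3))
```

At `beta = 0.0` every point receives equal weight, and at large `beta` nearly all mass sits on the point with the minimum score, mirroring cases (a) and (c) of the caption.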