Unsupervised Symbolic Anomaly Detection

Md Maruf Hossain, Tim Katzke, Simon Klüttermann, Emmanuel Müller

Abstract

We propose SYRAN, an unsupervised anomaly detection method based on symbolic regression. Instead of encoding normal patterns in an opaque, high-dimensional model, our method learns an ensemble of human-readable equations that describe symbolic invariants: functions that are approximately constant on normal data. Deviations from these invariants yield anomaly scores, so that the detection logic is interpretable by construction, rather than via post-hoc explanation. Experimental results demonstrate that SYRAN is highly interpretable, providing equations that correspond to known scientific or medical relationships, and maintains strong anomaly detection performance comparable to that of state-of-the-art methods.
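
The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes two hand-written toy invariants that evaluate to roughly 1 on normal samples, and it assumes the ensemble score is the mean absolute deviation from that constant. The function names, the toy equations, and the aggregation rule are all hypothetical; they are not equations learned by SYRAN or the paper's exact scoring function.

```python
from statistics import mean

# Hypothetical invariants: each should evaluate to roughly 1 on normal samples.
# These toy equations are illustrative assumptions, not output of SYRAN.
def invariant_ratio(x):
    # Assumes feature 1 is about twice feature 0 on normal data.
    return x[1] / (2.0 * x[0])

def invariant_sum(x):
    # Assumes features 0 and 2 sum to about 10 on normal data.
    return (x[0] + x[2]) / 10.0

INVARIANTS = [invariant_ratio, invariant_sum]

def anomaly_score(x, invariants=INVARIANTS, target=1.0):
    """Mean absolute deviation of the invariants from their constant target.

    Averaging absolute deviations is an assumed aggregation rule.
    """
    return mean(abs(f(x) - target) for f in invariants)

normal_sample = (3.0, 6.1, 7.0)      # both invariants stay close to 1
anomalous_sample = (3.0, 12.0, 7.0)  # breaks the ratio invariant

print(anomaly_score(normal_sample))
print(anomaly_score(anomalous_sample))
```

Because each invariant is a readable equation, a high score can be traced to the specific relationship that was violated, here the ratio between the first two features.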

Paper Structure (23 sections, 11 equations, 3 figures, 1 table, 1 algorithm)

Figures (3)

  • Figure 1: Box plots and critical difference diagrams comparing the AUC-ROC performance of the SYRAN ensemble (a,b) and its best-performing ensemble member (a,c) against baseline anomaly detection methods.
  • Figure 2: Symbolic invariants learned by SYRAN on three ADBench datasets. Significant deviations of function values from 1 are indicative of anomalous behavior.
  • Figure 3: Mean AUC-ROC performance across benchmark datasets for SYRAN and its best ensemble member across different hyperparameterizations compared to LOF as its strongest competitor. The default configuration is highlighted in gray.