Coniferest: a complete active anomaly detection framework

M. V. Kornilov, V. S. Korolev, K. L. Malanchev, A. D. Lavrukhina, E. Russeil, T. A. Semenikhin, E. Gangler, E. E. O. Ishida, M. V. Pruzhinskaya, A. A. Volnova, S. Sreejith

TL;DR

Coniferest is an open-source, general-purpose active anomaly detection framework written in Python. This paper presents the package and describes several success cases from applying it to real astronomical data in active anomaly detection tasks within the SNAD project.

Abstract

We present coniferest, an open-source, general-purpose active anomaly detection framework written in Python. We describe the package design and the implemented algorithms. Static outlier detection is currently supported via the Isolation forest algorithm, while the Active Anomaly Discovery (AAD) and Pineforest algorithms are available for active anomaly detection problems. We evaluate the algorithms and the package performance on a series of synthetic datasets. We also describe several success cases that resulted from applying the package to real astronomical data in active anomaly detection tasks within the SNAD project.
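
To make the static outlier detection step concrete, the following is a minimal, standard-library-only sketch of the textbook Isolation forest idea: points are split by random axis-aligned cuts, and points that are isolated after few cuts receive a score close to 1. It is not coniferest's implementation or API. All names here (`build_tree`, `fit_forest`, `anomaly_scores`) and the defaults of 100 trees and 256-point subsamples are assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points with random axis-aligned cuts (hypothetical helper)."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    low, high = min(values), max(values)
    if low == high:  # no cut can separate the points along this feature
        return {"size": len(points)}
    threshold = rng.uniform(low, high)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point):
    """Depth at which the point reaches a leaf, plus a correction for unsplit leaf points."""
    depth = 0
    while "size" not in tree:
        side = "left" if point[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[side]
        depth += 1
    return depth + average_path_length(tree["size"])


def fit_forest(data, n_trees=100, subsample_size=256, seed=0):
    """Build n_trees trees, each on a random subsample of the data."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    subsample_size = min(subsample_size, len(data))
    max_depth = math.ceil(math.log2(subsample_size))
    trees = [
        build_tree(rng.sample(data, subsample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, subsample_size


def anomaly_scores(forest, data):
    """Score each point as 2 ** (-mean path length / normalization); higher is more anomalous."""
    trees, subsample_size = forest
    norm = average_path_length(subsample_size)
    scores = []
    for point in data:
        mean_path = sum(path_length(tree, point) for tree in trees) / len(trees)
        scores.append(2.0 ** (-mean_path / norm))
    return scores


# Synthetic demo: a Gaussian blob with two features plus one distant point.
demo_rng = random.Random(42)
inliers = [(demo_rng.gauss(0.0, 1.0), demo_rng.gauss(0.0, 1.0)) for _ in range(500)]
data = inliers + [(8.0, 8.0)]
forest = fit_forest(data)
scores = anomaly_scores(forest, data)
print("outlier score:", scores[-1])
print("mean inlier score:", sum(scores[:-1]) / len(inliers))
```

In this sketch the distant point should score well above the blob average, because a single random cut often separates it from everything else. Active algorithms such as AAD and Pineforest go further by using expert feedback on the top-ranked candidates, which this static example does not attempt to reproduce.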

Paper Structure

This paper contains 12 sections, 8 equations, 1 figure.

Figures (1)

  • Figure 1: Performance comparison between scikit-learn (listed as "sklearn") and coniferest on a dataset of $\approx 10^6$ samples with $2$ features each. Anomaly score evaluation on this dataset was measured for different versions of scikit-learn. Coniferest in single-thread mode is also considered. Note that scikit-learn is always single-threaded.