Signature Isolation Forest
Marta Campi, Guillaume Staerman, Gareth W. Peters, Tomoko Matsui
TL;DR
The paper addresses functional anomaly detection and the sensitivity of Functional Isolation Forest (FIF) to its representation choices, namely the dictionary and the linear inner product. It introduces two algorithms, K-SIF and SIF, that embed functional paths via rough path signatures within an isolation forest framework: K-SIF replaces the linear inner product with a signature kernel, and SIF splits directly on signature coordinates, giving nonlinear, dictionary-free splits. Through parameter sweeps, swap-order tests, and real-data benchmarks, the authors report that K-SIF often outperforms FIF and that SIF achieves state-of-the-art performance while remaining robust and computationally efficient. The work provides practical, data-driven tools for reliable, scalable functional anomaly detection across diverse datasets.
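To make the idea of splitting on signature coordinates concrete, here is a minimal sketch rather than the authors' implementation. It computes the depth-2 truncated signature of a piecewise-linear path with Chen's identity, then draws one isolation-forest-style random split on a signature coordinate. The function names (`signature_depth2`, `signature_features`, `random_split`), the truncation depth, the time augmentation, and the uniform threshold rule are all assumptions made for illustration.

```python
import numpy as np

def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: array of shape (n_points, d).
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity: append one linear segment with increment `delta`.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2

def signature_features(path):
    """Flatten the depth-2 signature into a single feature vector."""
    level1, level2 = signature_depth2(path)
    return np.concatenate([level1, level2.ravel()])

def random_split(features, rng):
    """One random split: pick a coordinate, then a uniform threshold (assumed rule)."""
    coord = int(rng.integers(features.shape[1]))
    column = features[:, coord]
    threshold = rng.uniform(column.min(), column.max())
    return coord, threshold, column <= threshold

# Usage: time-augmented curves (t, x(t)) so that the signature sees the time axis.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
curves = [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t ** 2]
features = np.stack([signature_features(np.column_stack([t, x])) for x in curves])
coord, threshold, goes_left = random_split(features, rng)
```

No dictionary appears anywhere in this sketch: the split acts on features computed from the curve alone, which is the property the summary attributes to SIF.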
Abstract
Functional Isolation Forest (FIF) is a recent state-of-the-art Anomaly Detection (AD) algorithm designed for functional data. It relies on a tree partition procedure in which an abnormality score is computed by projecting each curve observation onto a randomly drawn dictionary through a linear inner product. Both the linear inner product and the dictionary are a priori choices that strongly influence the algorithm's performance and may lead to unreliable results, particularly with complex datasets. This work addresses these challenges by introducing Signature Isolation Forest, a novel class of AD algorithms leveraging the signature transform from rough path theory. Our objective is to remove the constraints imposed by FIF by proposing two algorithms that specifically target the linearity of the FIF inner product and the choice of the dictionary. We provide several numerical experiments, including a benchmark on real-world applications, demonstrating the relevance of our methods.
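The FIF projection step described in the abstract can be sketched as follows, to show why the dictionary choice matters. This is an illustrative approximation, not the FIF reference code: it assumes curves sampled on a common grid, an L2 inner product approximated by the trapezoidal rule, and two hypothetical dictionary atoms (a sine and a cosine).

```python
import numpy as np

def l2_inner_product(x, atom, t):
    """Trapezoidal approximation of the integral of x(t) * atom(t) over the grid t."""
    f = np.asarray(x, dtype=float) * np.asarray(atom, dtype=float)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

def project_curves(curves, atom, t):
    """Project every curve onto one dictionary atom; returns one scalar per curve."""
    return np.array([l2_inner_product(x, atom, t) for x in curves])

# Usage: the same curve gets very different projections under different atoms.
t = np.linspace(0.0, 1.0, 101)
curve = np.sin(2 * np.pi * t)
sine_atom = np.sin(2 * np.pi * t)    # hypothetical atom aligned with the curve
cosine_atom = np.cos(2 * np.pi * t)  # hypothetical atom orthogonal to the curve

proj_sine = project_curves([curve], sine_atom, t)[0]
proj_cosine = project_curves([curve], cosine_atom, t)[0]
```

Here the projection onto the sine atom is close to 0.5, while the projection onto the cosine atom is close to 0, so a tree that splits on the second projection carries almost no information about this curve. This illustrates the dependence on a priori choices that the abstract describes.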
