Collective Outlier Detection and Enumeration with Conformalized Closed Testing
Chiara G. Magnani, Matteo Sesia, Aldo Solari
TL;DR
The paper addresses collective outlier detection with distribution-free guarantees by introducing ACODE, a framework that uses conformal inference to convert powerful, possibly black-box classifiers into principled conformity scores for global testing and enumeration. ACODE automates the choice of classifier and two-sample testing procedure, and integrates closed testing to yield both a global outlier test and simultaneous lower bounds on the number of outliers in any subset, with data-driven tuning that preserves validity. The approach combines Shiraishi's locally most powerful rank tests and adaptive testing with exchangeability-based asymptotics, providing finite-sample validity and asymptotic power guarantees under mild assumptions. Empirical demonstrations on synthetic data and the LHCO particle-collision dataset show that ACODE achieves near-oracle performance in global detection and outlier enumeration, while remaining robust to selection bias and applicable to large-scale data. The work offers a practical, distribution-free toolkit for applications in finance, cybersecurity, and physics, where collective anomalies can be more detectable than individual outliers.
Abstract
This paper develops a flexible distribution-free method for collective outlier detection and enumeration, designed for situations in which the presence of outliers can be detected powerfully even though their precise identification may be challenging due to the sparsity, weakness, or elusiveness of their signals. This method builds upon recent developments in conformal inference and integrates classical ideas from other areas, including multiple testing, locally most powerful and adaptive rank tests, and non-parametric large-sample asymptotics. The key innovation lies in developing a principled and effective approach for automatically choosing the most appropriate machine learning classifier and two-sample testing procedure for a given data set. The performance of our method is investigated through extensive empirical demonstrations, including an analysis of the LHCO high-energy particle collision data set.
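To make the basic ingredients concrete, the following is a minimal sketch, not the authors' implementation, of how conformity scores from a classifier can feed a distribution-free global test. It assumes the scores have already been computed on a calibration set of known inliers and on a test set, with larger scores meaning "more outlying". The function names are hypothetical, and a plain rank-sum (Mann-Whitney) statistic with a permutation null stands in for the adaptively chosen two-sample test described above; closed testing and enumeration are not shown.

```python
import random


def conformal_pvalues(cal_scores, test_scores):
    """One conformal p-value per test point (larger score = more outlying).

    Each p-value is (1 + number of calibration scores >= test score) / (n + 1).
    """
    n = len(cal_scores)
    return [
        (1 + sum(1 for c in cal_scores if c >= t)) / (n + 1)
        for t in test_scores
    ]


def rank_sum_statistic(cal_scores, test_scores):
    """Mann-Whitney U: count of (calibration, test) pairs with test > calibration."""
    return sum(1 for t in test_scores for c in cal_scores if t > c)


def global_permutation_test(cal_scores, test_scores, n_perm=2000, seed=0):
    """Permutation p-value for the global null 'the test set has no outliers'.

    Under that null the pooled scores are assumed exchangeable, so randomly
    re-splitting them reproduces the null distribution of the statistic.
    """
    rng = random.Random(seed)
    observed = rank_sum_statistic(cal_scores, test_scores)
    pooled = list(cal_scores) + list(test_scores)
    n = len(cal_scores)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if rank_sum_statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # The +1 terms keep the p-value valid for a finite number of permutations.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    # Toy scores (illustrative only): 10 of the 30 test points are shifted.
    rng = random.Random(1)
    cal = [rng.gauss(0.0, 1.0) for _ in range(100)]
    test = [rng.gauss(0.0, 1.0) for _ in range(20)]
    test += [rng.gauss(1.5, 1.0) for _ in range(10)]
    print("smallest conformal p-value:", min(conformal_pvalues(cal, test)))
    print("global p-value:", global_permutation_test(cal, test, n_perm=500))
```

The rank-sum statistic pools evidence across all test points, which is why a global test of this kind can have power even when no single conformal p-value is small enough to flag an individual outlier.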
