Sequential Outlier Hypothesis Testing under Universality Constraints
Jun Diao, Lin Zhou
TL;DR
This work analyzes sequential outlier hypothesis testing when both the nominal and anomalous distributions are unknown, introducing universal error-exponent and stopping-time guarantees. It derives tight large-deviation bounds for the case of exactly one outlier and extends them to multiple-outlier settings, establishing GJS-based exponents under error probability universality and Rényi-based exponents under expected-stopping-time universality. The results show that sequential tests can outperform fixed-length tests in both misclassification and Bayesian exponents, and they quantify the penalty incurred when the number of outliers is unknown. The analysis uses the method of types to obtain rigorous bounds and offers practical insights for universal anomaly detection on finite alphabets, with avenues for non-asymptotic and continuous-domain extensions.
Abstract
We revisit sequential outlier hypothesis testing and derive bounds on achievable exponents when both the nominal and anomalous distributions are unknown. The task of outlier hypothesis testing is to identify the set of outliers that are generated from an anomalous distribution among all observed sequences, the majority of which are generated from a nominal distribution. In the sequential setting, one obtains a symbol from each sequence per unit time until a reliable decision can be made. For the case with exactly one outlier, our exponent bounds are tight, providing an exact large deviations characterization of sequential tests and strengthening a previous result of Li, Nitinawarat and Veeravalli (2017). In particular, the average sample size of our sequential test is bounded universally under any pair of nominal and anomalous distributions and our sequential test achieves a larger Bayesian exponent than the fixed-length test, which could not be guaranteed by the sequential test of Li, Nitinawarat and Veeravalli (2017). For the case with at most one outlier, we propose a threshold-based test that has bounded expected stopping time under mild conditions, and we bound the exponential decay rate of the error probabilities under each non-null hypothesis and under the null hypothesis. Our sequential test resolves the tradeoff among the exponential decay rates of the misclassification, false reject, and false alarm probabilities for the fixed-length test of Zhou, Wei and Hero (TIT 2022). Finally, as a further step towards practical applications, we generalize our results to the case of multiple outliers and show that there is a penalty in the error exponents when the number of outliers is unknown.
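To make the sequential setting concrete, the following is a minimal illustrative sketch for the exactly-one-outlier case on a finite alphabet. It is not the test proposed in this paper. The scoring function (a sum of KL divergences from each remaining empirical distribution to their average), the stopping rule (stop once n times the second-smallest score exceeds a threshold), and all names, distributions, and parameter values are assumptions chosen for illustration only.

```python
import math
import random


def kl(p, q):
    """KL divergence D(p || q) in nats; assumes q[a] > 0 wherever p[a] > 0."""
    return sum(pa * math.log(pa / qa) for pa, qa in zip(p, q) if pa > 0)


def scores(counts, n):
    """Score candidate i by how homogeneous the other sequences look.

    For each i, average the empirical distributions of all sequences except i
    and sum the KL divergences from each of them to that average. A small
    score means the sequences other than i look alike (assumed score).
    """
    m = len(counts)
    emp = [[c / n for c in row] for row in counts]
    out = []
    for i in range(m):
        rest = [emp[j] for j in range(m) if j != i]
        avg = [sum(col) / len(rest) for col in zip(*rest)]
        out.append(sum(kl(q, avg) for q in rest))
    return out


def sequential_outlier_test(streams, alphabet_size, threshold, max_n):
    """Draw one symbol per sequence per time step until the stopping rule fires.

    Hypothetical rule: stop at the first n where n times the second-smallest
    score exceeds `threshold`, and declare the index with the smallest score.
    Returns (decision, stopping_time); decision is None if max_n is reached.
    """
    counts = [[0] * alphabet_size for _ in streams]
    for n in range(1, max_n + 1):
        for row, stream in zip(counts, streams):
            row[next(stream)] += 1
        s = scores(counts, n)
        order = sorted(range(len(s)), key=s.__getitem__)
        if n * s[order[1]] > threshold:
            return order[0], n
    return None, max_n


def symbol_stream(rng, probs):
    """Endless i.i.d. symbols from `probs` over the alphabet {0, ..., len(probs)-1}."""
    while True:
        yield rng.choices(range(len(probs)), weights=probs)[0]


# Toy example: 5 sequences, the one at index 2 is the outlier.
rng = random.Random(0)
nominal = [0.7, 0.2, 0.1]    # unknown to the test
anomalous = [0.1, 0.2, 0.7]  # unknown to the test
true_outlier = 2
streams = [
    symbol_stream(rng, anomalous if i == true_outlier else nominal)
    for i in range(5)
]
decision, stop_time = sequential_outlier_test(
    streams, alphabet_size=3, threshold=30.0, max_n=2000
)
print(decision, stop_time)
```

The test never uses `nominal` or `anomalous` directly; it only sees the symbols, which mirrors the universality requirement described above. Raising `threshold` in this sketch trades a longer stopping time for a smaller chance of declaring the wrong sequence.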
