Transfer Neyman-Pearson Algorithm for Outlier Detection
Mohammadreza M. Kalan, Eitan J. Neugut, Samory Kpotufe
TL;DR
This paper tackles transfer learning for outlier detection under severe class imbalance by developing a transfer Neyman-Pearson framework with surrogate losses. It introduces an implementable TLNP algorithm built on a constrained optimization of a Lagrangian that jointly leverages source and target data, and provides finite-sample generalization bounds tied to the transfer exponent $\rho(r)$ and the hypothesis class complexity. Theoretical results (Theorems 1–2) characterize how target excess error scales with $n_S$, $n_T$, and source-target relatedness, while empirical results on climate, NASA, financial, and synthetic data demonstrate that TLNP can exploit informative sources and avoid negative transfer when sources are unrelated. The approach is model-free, adaptable to neural networks or quadratic models, and offers practical benefits for imbalanced, domain-transferable detection tasks with real-world relevance such as heavy rainfall prediction and credit delinquency risk.
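The Lagrangian idea mentioned in the summary can be illustrated with a small sketch. This is not the authors' TLNP procedure. It assumes a linear scorer, a logistic surrogate loss, a hypothetical `source_weight` that mixes the source and target abnormal risks, and plain gradient descent-ascent on a Lagrangian whose multiplier `lam` enforces a surrogate Type-I budget `alpha` on normal data. All function names, synthetic data, and hyperparameters below are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function, written via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def surrogate_risks(w, b, X0, XS, XT):
    # Convention (assumed): score > 0 means "flag as abnormal".
    r0 = np.logaddexp(0.0, X0 @ w + b).mean()     # normal points flagged (Type-I surrogate)
    rS = np.logaddexp(0.0, -(XS @ w + b)).mean()  # source abnormal points missed
    rT = np.logaddexp(0.0, -(XT @ w + b)).mean()  # target abnormal points missed
    return r0, rS, rT

def fit_np_lagrangian(X0, XS, XT, alpha=0.1, source_weight=1.0,
                      lr=0.1, dual_lr=0.05, n_iter=3000):
    # Lagrangian (illustrative): source_weight * rS + rT + lam * (r0 - alpha)
    w, b, lam = np.zeros(X0.shape[1]), 0.0, 1.0
    for _ in range(n_iter):
        # Per-sample derivatives of each surrogate loss with respect to the score.
        g0 = sigmoid(X0 @ w + b)          # d/dz log(1 + e^z)
        gS = sigmoid(XS @ w + b) - 1.0    # d/dz log(1 + e^-z)
        gT = sigmoid(XT @ w + b) - 1.0
        grad_w = (lam * (X0.T @ g0) / len(X0)
                  + source_weight * (XS.T @ gS) / len(XS)
                  + (XT.T @ gT) / len(XT))
        grad_b = lam * g0.mean() + source_weight * gS.mean() + gT.mean()
        # Primal descent step on the scorer parameters.
        w -= lr * grad_w
        b -= lr * grad_b
        # Dual ascent step: raise lam while the Type-I surrogate exceeds alpha.
        r0, _, _ = surrogate_risks(w, b, X0, XS, XT)
        lam = max(0.0, lam + dual_lr * (r0 - alpha))
    return w, b, lam

# Synthetic example: many normal points, many source outliers, few target outliers.
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(500, 2))
XS = rng.normal(0.0, 1.0, size=(200, 2)) + np.array([3.0, 3.0])
XT = rng.normal(0.0, 1.0, size=(10, 2)) + np.array([3.5, 2.5])

w, b, lam = fit_np_lagrangian(X0, XS, XT)
type1_rate = float(np.mean(X0 @ w + b > 0))          # empirical false-alarm rate
target_detect_rate = float(np.mean(XT @ w + b > 0))  # empirical detection rate on target outliers
```

In this toy setup, lowering `source_weight` toward zero makes the fit rely only on the few target outliers, which is one simple way to see how an unrelated source could be down-weighted. The constraint here is placed on the surrogate risk, so the empirical 0-1 false-alarm rate is only controlled indirectly.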
Abstract
We consider the problem of transfer learning in outlier detection where target abnormal data is rare. While transfer learning has been considered extensively in traditional balanced classification, the problem of transfer in outlier detection, and more generally in imbalanced classification settings, has received less attention. We propose a general meta-algorithm which is shown theoretically to yield strong guarantees with respect to a range of changes in the abnormal distribution, and which is at the same time amenable to practical implementation. We then investigate different instantiations of this general meta-algorithm, e.g., based on multi-layer neural networks, and show empirically that they outperform natural extensions of transfer methods for traditional balanced classification settings (which are the only solutions available at the moment).
