Transfer Neyman-Pearson Algorithm for Outlier Detection

Mohammadreza M. Kalan, Eitan J. Neugut, Samory Kpotufe

TL;DR

This paper tackles transfer learning for outlier detection under severe class imbalance by developing a transfer Neyman-Pearson framework with surrogate losses. It introduces an implementable TLNP algorithm built on a constrained optimization of a Lagrangian that jointly leverages source and target data, and provides finite-sample generalization bounds tied to the transfer exponent $\rho(r)$ and the hypothesis class complexity. Theoretical results (Theorems 1–2) characterize how target excess error scales with $n_S$, $n_T$, and source-target relatedness, while empirical results on climate, NASA, financial, and synthetic data demonstrate that TLNP can exploit informative sources and avoid negative transfer when sources are unrelated. The approach is model-free, adaptable to neural networks or quadratic models, and offers practical benefits for imbalanced, domain-transferable detection tasks with real-world relevance such as heavy rainfall prediction and credit delinquency risk.

Abstract

We consider the problem of transfer learning in outlier detection where target abnormal data is rare. While transfer learning has been considered extensively in traditional balanced classification, the problem of transfer in outlier detection and more generally in imbalanced classification settings has received less attention. We propose a general meta-algorithm which is shown theoretically to yield strong guarantees w.r.t. a range of changes in abnormal distribution, and which is at the same time amenable to practical implementation. We then investigate different instantiations of this general meta-algorithm, e.g., based on multi-layer neural networks, and show empirically that they outperform natural extensions of transfer methods for traditional balanced classification settings (which are the only solutions available at the moment).
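
To make the idea of a Lagrangian that combines source and target data more concrete, here is a minimal sketch. It is not the authors' TLNP algorithm: it assumes a linear scorer, a base-2 logistic surrogate loss, a single hypothetical weight `gamma` on the source abnormal risk, and plain primal-dual gradient updates. All function and parameter names (`fit_np_lagrangian`, `gamma`, `lr_dual`, and so on) are invented for this illustration.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return np.logaddexp(0.0, z)

def sigmoid(z):
    # Stable logistic function written via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def surrogate_risk_and_grad(X, sign, w, b):
    """Mean of log2(1 + exp(-sign * f(x))) with f(x) = w.x + b, plus gradients.

    sign = +1 for abnormal points (loss is large when f is negative),
    sign = -1 for normal points (loss is large when f is positive).
    """
    f = X @ w + b
    m = -sign * f
    risk = softplus(m).mean() / np.log(2)
    # d/df log2(1 + exp(-sign * f)) = -sign * sigmoid(-sign * f) / ln 2
    g = -sign * sigmoid(m) / (np.log(2) * len(X))
    return risk, X.T @ g, g.sum()

def fit_np_lagrangian(X0, X1_tgt, X1_src, alpha=0.05, gamma=1.0,
                      lr=0.05, lr_dual=0.05, steps=3000):
    """Illustrative primal-dual loop (an assumption, not the paper's procedure).

    Minimizes  R1_target + gamma * R1_source + lam * (R0 - alpha)  over (w, b),
    while lam >= 0 is raised whenever the surrogate Type-I risk R0 exceeds alpha.
    """
    w, b, lam = np.zeros(X0.shape[1]), 0.0, 1.0
    for _ in range(steps):
        r0, gw0, gb0 = surrogate_risk_and_grad(X0, -1.0, w, b)
        _, gw_t, gb_t = surrogate_risk_and_grad(X1_tgt, +1.0, w, b)
        _, gw_s, gb_s = surrogate_risk_and_grad(X1_src, +1.0, w, b)
        # Primal descent step on the Lagrangian.
        w = w - lr * (gw_t + gamma * gw_s + lam * gw0)
        b = b - lr * (gb_t + gamma * gb_s + lam * gb0)
        # Dual ascent step on the Type-I constraint, projected onto lam >= 0.
        lam = max(0.0, lam + lr_dual * (r0 - alpha))
    return w, b, lam
```

A point `x` would be flagged as abnormal when `x @ w + b > 0`. Because the base-2 logistic loss upper-bounds the 0-1 loss, keeping the surrogate Type-I risk near `alpha` keeps the empirical false-alarm rate on the training normals at or below roughly that level. How much to trust the source (here the fixed `gamma`) is exactly the question the paper's analysis addresses; this sketch does not attempt to reproduce that mechanism.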

Paper Structure

This paper contains 20 sections, 3 theorems, 32 equations, 9 figures, and 1 table.

Key Result

Proposition 1

Let $\delta>0$ and let $\mathcal{H}$ be a hypothesis class satisfying Assumption assumption_rademacher. Furthermore, suppose that $\hat{R}_{\varphi, \mu}$ denotes the empirical error with respect to $n$ i.i.d. samples drawn from a distribution $\mu$, which could be either $\mu_0$ or $\mu_1$. Then, with probability … where $C$ is defined in Definition surrogate_loss.

Figures (9)

  • Figure 1: Clusters of locations for rain precipitation data yu2024climsim, used as source-target pairs. In one scenario, $(26, 27)$ forms a source-target pair, while in another scenario, $38$ is the target, and $37$ and $39$ grouped together constitute the source.
  • Figure 2: The performance of our algorithm (TLNP), along with other approaches on the Climate data yu2024climsim, is evaluated for a Type-I error rate of $\alpha=0.05$. In this experiment, one scenario fixes the number of target heavy rain samples at $n_T=50$ while increasing the number of source heavy rain samples $n_S$. In the other scenario, $n_S$ is fixed at $2500$, and $n_T$ is varied. In both cases, the target non-heavy rain class contains 4000 training samples.
  • Figure 3: The performance of our algorithm (TLNP), along with other approaches on the Climate data yu2024climsim, is evaluated for a Type-I error rate of $\alpha=0.05$. In this experiment, one scenario fixes the number of target heavy rain samples at $n_T=50$ while increasing the number of source heavy rain samples $n_S$. In the other scenario, $n_S$ is fixed at $2500$, and $n_T$ is varied. In both cases, the target non-heavy rain class contains 4000 training samples.
  • Figure 4: The performance of our algorithm (TLNP), along with other approaches on the Climate data yu2024climsim, is evaluated for a Type-I error rate of $\alpha=0.05$. In this experiment, one scenario fixes the number of target heavy rain samples at $n_T=50$ while increasing the number of source heavy rain samples $n_S$. In the other scenario, $n_S$ is fixed at $2500$, and $n_T$ is varied. In both cases, the target non-heavy rain class contains 4000 training samples.
  • Figure 5: The performance of our algorithm (TLNP), along with other approaches, on financial data github_credit for predicting whether a person will become financially delinquent. The threshold on Type-I error is set at $\alpha=0.1$. In this experiment, the number of source samples is fixed at $n_S=2500$, and $n_T$ is varied from 25 to 250. Moreover, the target normal class contains 4000 training samples.
  • ...and 4 more figures
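
The captions above report performance at a fixed Type-I error level $\alpha$. As a generic illustration of that kind of evaluation (not necessarily the protocol behind these figures), the sketch below picks a score threshold from held-out normal samples so that the empirical Type-I error is at most `alpha`, then measures the Type-II error on abnormal samples. The function names and the convention "flag as abnormal when the score exceeds the threshold" are assumptions made for this example.

```python
import numpy as np

def threshold_at_type1(scores_normal, alpha):
    """Return a threshold t so that at most a fraction alpha of normal scores exceed t.

    Assumes 0 <= alpha < 1 and that higher scores mean "more abnormal".
    """
    s = np.sort(np.asarray(scores_normal, dtype=float))
    n = len(s)
    k = int(np.floor(alpha * n))  # number of normal points allowed above t
    return s[n - k - 1]

def type1_error(scores_normal, t):
    # Fraction of normal points wrongly flagged as abnormal.
    return float(np.mean(np.asarray(scores_normal) > t))

def type2_error(scores_abnormal, t):
    # Fraction of abnormal points that are missed (score not above t).
    return float(np.mean(np.asarray(scores_abnormal) <= t))
```

For instance, with normal scores `0, 1, ..., 99` and `alpha = 0.05`, the threshold lands on the score 94, so exactly five normal points are flagged. Calibrating on held-out normal data rather than on the training set keeps the reported Type-I error from being optimistically biased.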

Theorems & Definitions (10)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4: Rademacher Complexity bartlett2002rademacher
  • Remark 1
  • Definition 5: Transfer Exponent
  • Proposition 1
  • Theorem 1
  • Remark 2
  • Theorem 2