Neyman-Pearson Classification under Both Null and Alternative Distributions Shift
Mohammadreza M. Kalan, Yuyang Deng, Eitan J. Neugut, Samory Kpotufe
TL;DR
This work addresses transfer learning for Neyman-Pearson classification when both class-conditional distributions, $\mu_0$ and $\mu_1$, may shift between source and target. It proposes a two-stage adaptive procedure that first aligns the source constraint with the target and then refines the model using source class-1 data. The procedure enforces the target Type-I constraint, reduces the target Type-II error when the source is informative, and avoids negative transfer otherwise. Theoretical guarantees are stated via transfer moduli $\phi_{1}^{S\to T}$ and $\phi_{0}^{S\to T}$ that bound the excess risk. A computational framework based on two staged convex programs, solved with stochastic gradient descent-ascent (SGDA), gives polynomial-time guarantees. Empirical results on climate datasets show adaptive gains when the source is helpful and robust performance when it is not, underscoring the method's practical value for imbalanced NP transfer learning.
Abstract
We consider the problem of transfer learning in Neyman-Pearson classification, where the objective is to minimize the error w.r.t. a distribution $\mu_1$, subject to the constraint that the error w.r.t. a distribution $\mu_0$ remains below a prescribed threshold. While transfer learning has been extensively studied in traditional classification, transfer learning in imbalanced classification such as Neyman-Pearson classification has received much less attention. This setting poses unique challenges, as both types of errors must be simultaneously controlled. Existing works address only the case of distribution shift in $\mu_1$, whereas in many practical scenarios shifts may occur in both $\mu_0$ and $\mu_1$. We derive an adaptive procedure that not only guarantees improved Type-I and Type-II errors when the source is informative, but also automatically adapts to situations where the source is uninformative, thereby avoiding negative transfer. In addition to such statistical guarantees, the procedure is efficient, as shown via complementary computational guarantees.
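
To make the Neyman-Pearson objective concrete, the following is a minimal sketch of a generic plug-in threshold rule on a single distribution pair. It is not the transfer procedure of this paper. It assumes a real-valued score is already available for each point, and the names `np_threshold`, `empirical_errors`, `scores0`, `scores1`, and `alpha` are hypothetical. The rule picks the smallest threshold whose empirical Type-I error on class-0 scores is at most `alpha`, then reports the resulting empirical Type-II error on class-1 scores.

```python
import numpy as np

def np_threshold(scores0, alpha):
    """Smallest threshold t with empirical Type-I error mean(scores0 > t) <= alpha.

    Illustrative plug-in rule only (an assumption, not the paper's method).
    """
    s = np.sort(np.asarray(scores0, dtype=float))
    n = s.size
    # At most floor(alpha * n) class-0 points may lie strictly above t.
    k = int(np.floor(alpha * n))
    return s[n - k - 1] if k < n else -np.inf

def empirical_errors(scores0, scores1, t):
    """Empirical Type-I and Type-II errors of the rule 'predict 1 if score > t'."""
    type1 = float(np.mean(np.asarray(scores0, dtype=float) > t))   # class 0 predicted as 1
    type2 = float(np.mean(np.asarray(scores1, dtype=float) <= t))  # class 1 predicted as 0
    return type1, type2

# Usage on synthetic scores (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
scores0 = rng.normal(0.0, 1.0, size=200)  # class-0 (null) scores
scores1 = rng.normal(2.0, 1.0, size=200)  # class-1 (alternative) scores
t = np_threshold(scores0, alpha=0.05)
type1, type2 = empirical_errors(scores0, scores1, t)
```

This sketch only controls the empirical Type-I error on the sample at hand; a population-level guarantee would need an additional slack term, and handling shift in $\mu_0$ and $\mu_1$ is precisely what the adaptive procedure in the paper is designed for.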
