Robust and Sparse Estimation of Unbounded Density Ratio under Heavy Contamination

Ryosuke Nagumo, Hironori Fujisawa

TL;DR

The paper addresses robust density ratio estimation under heavy contamination, proposing a non-asymptotic analysis of Weighted DRE that permits unbounded density ratios as long as the weighted ratio r(x)w(x) is bounded. It introduces a comprehensive set of assumptions on the weight function, robustness under contamination, and sparse estimation, and proves a main theorem giving sparse consistency and a concrete error bound that separates naive statistical error, weight-induced variability, and contamination effects. The results show that Weighted DRE can outperform conventional DRE in contaminated settings and remain effective for unbounded density ratios when the weighting properly bounds the weighted ratio. Numerical experiments corroborate the theory, demonstrating strong resilience to outliers and preservation of sparse recovery across dimensional regimes, with practical implications for change detection, outlier handling, and covariate-shift scenarios.

Abstract

We examine the non-asymptotic properties of robust density ratio estimation (DRE) in contaminated settings. Weighted DRE is the most promising among existing methods, exhibiting doubly strong robustness from an asymptotic perspective. This study demonstrates that Weighted DRE achieves sparse consistency even under heavy contamination within a non-asymptotic framework. This method addresses two significant challenges in density ratio estimation and robust estimation. For density ratio estimation, we provide the non-asymptotic properties of estimating unbounded density ratios under the assumption that the weighted density ratio function is bounded. For robust estimation, we introduce a non-asymptotic framework for doubly strong robustness under heavy contamination, assuming that at least one of the following conditions holds: (i) contamination ratios are small, or (ii) outliers have small weighted values. This work provides the first non-asymptotic analysis of strong robustness under heavy contamination.
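
The boundedness assumption on the weighted density ratio can be illustrated with a minimal numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two zero-mean Gaussians with standard deviations 1.0 and 0.5, the Gaussian-type weight $w(x) = \exp(-2x^2)$, and the helper names `gaussian_pdf`, `density_ratio`, and `weight`. With these choices $r(x) = 0.5\exp(1.5x^2)$ is unbounded, while $r(x)w(x) = 0.5\exp(-0.5x^2)$ is bounded by 0.5.

```python
import math


def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def density_ratio(x, sigma_p=1.0, sigma_q=0.5):
    """r(x) = p(x) / q(x); unbounded because p has heavier tails than q."""
    return gaussian_pdf(x, sigma_p) / gaussian_pdf(x, sigma_q)


def weight(x, decay=2.0):
    """Hypothetical weight w(x) = exp(-decay * x^2), chosen to decay faster than r grows."""
    return math.exp(-decay * x ** 2)


# Evaluate on a grid over [-6, 6] with step 0.01.
grid = [-6.0 + 0.01 * i for i in range(1201)]
ratios = [density_ratio(x) for x in grid]
weighted = [density_ratio(x) * weight(x) for x in grid]

max_ratio = max(ratios)        # grows without bound as the grid widens
max_weighted = max(weighted)   # stays near 0.5 regardless of grid width

# A far-away point, standing in for an outlier, has a tiny weighted value.
outlier_weighted = density_ratio(10.0) * weight(10.0)

print(f"max r(x) on grid:      {max_ratio:.3e}")
print(f"max r(x)w(x) on grid:  {max_weighted:.3e}")
print(f"r(10)w(10):            {outlier_weighted:.3e}")
```

In this toy setting the same weight also matches the flavor of condition (ii): a point far from the bulk of the data contributes almost nothing once it is weighted, even though its raw ratio is enormous.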

Paper Structure

This paper contains 55 sections, 27 theorems, 231 equations, 3 figures.

Key Result

Theorem 3.5

We assume that $n_{p,q}^* \ge N_{\delta}$ holds, where $N_{\delta}$ is a positive constant. Under Assumptions assumption weight for normal, assumption-9, assumption-8, and assumption weight for outlier, we have for any $\bm{\theta}\in\Theta$ and $t \in \mathcal{E}$ with probability at least $1-2\delta$, where $\delta$ is a small positive constant.

Figures (3)

  • Figure 1: The success probability in the estimation of the active set of DRE and Weighted DRE in the clean and contaminated settings. The x-axis shows the dataset sizes and each line corresponds to the different dimension size $m$.
  • Figure 2: The success probability in the estimation of the active set of the bounded and unbounded density ratio by DRE and Weighted DRE. The x-axis shows the dataset sizes and each line corresponds to the different dimension size $m$.
  • Figure 3: The probability density functions of Gaussian distributions with different precisions and the weight function.

Theorems & Definitions (45)

  • Theorem 3.5
  • Proposition 3.7
  • Proposition 3.9
  • Theorem 3.11
  • Corollary 3.12
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Proposition A.1
  • ...and 35 more