Adaptive learning of density ratios in RKHS

Werner Zellinger, Stefan Kindermann, Sergei V. Pereverzyev

TL;DR

The paper addresses estimating the density ratio $\beta=\mathrm dP/\mathrm dQ$ from finite samples by learning a density-ratio model in an RKHS through a regularized Bregman-divergence objective. It establishes that the associated losses are generalized self-concordant, derives finite-sample error bounds that depend on a regularity parameter $r$ and a capacity parameter $\alpha$, and proves minimax-optimal rates for the square loss. A novel Lepskii-type principle is introduced to select the regularization parameter $\lambda$ adaptively, without knowledge of $r$; it achieves the optimal rate in the square-loss case and near-optimal rates in general. The framework is illustrated by a numerical example and supported by detailed proofs for both a priori and empirical-norm-based parameter choices, giving a practical, theory-backed method for density-ratio estimation in RKHSs.

Abstract

Estimating the ratio of two probability densities from finitely many observations of the densities is a central problem in machine learning and statistics with applications in two-sample testing, divergence estimation, generative modeling, covariate shift adaptation, conditional density estimation, and novelty detection. In this work, we analyze a large class of density ratio estimation methods that minimize a regularized Bregman divergence between the true density ratio and a model in a reproducing kernel Hilbert space (RKHS). We derive new finite-sample error bounds, and we propose a Lepskii type parameter choice principle that minimizes the bounds without knowledge of the regularity of the density ratio. In the special case of quadratic loss, our method adaptively achieves a minimax optimal error rate. A numerical illustration is provided.
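To make the quadratic-loss special case concrete, the following is a minimal sketch of a regularized least-squares density-ratio fit in an RKHS (in the spirit of KuLSIF), not the authors' implementation. It assumes one-dimensional samples, a Gaussian kernel with a fixed bandwidth, and the objective $\frac{1}{2n}\sum_i f(x_i)^2-\frac{1}{m}\sum_j f(y_j)+\frac{\lambda}{2}\|f\|^2$ with $x_i\sim Q$ and $y_j\sim P$; the function names and the scaling of $\lambda$ are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between two 1-D sample arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def fit_ls_density_ratio(x_q, y_p, lam, bandwidth=1.0):
    """Least-squares density-ratio fit in an RKHS (illustrative sketch).

    x_q: samples from the denominator distribution Q.
    y_p: samples from the numerator distribution P.
    lam: regularization parameter (assumed convention: lam/2 * ||f||^2).

    By the representer theorem the minimizer has the form
    f = sum_i theta_x[i] k(., x_i) + sum_j theta_y[j] k(., y_j).
    Setting the gradient to zero gives the closed form used below.
    """
    n, m = len(x_q), len(y_p)
    k_xx = gaussian_kernel(x_q, x_q, bandwidth)
    k_xy = gaussian_kernel(x_q, y_p, bandwidth)
    # Coefficients on the numerator samples are constant.
    theta_y = np.full(m, 1.0 / (m * lam))
    # Coefficients on the denominator samples solve a linear system.
    theta_x = -np.linalg.solve(k_xx + n * lam * np.eye(n), k_xy @ theta_y)

    def ratio(t):
        """Evaluate the fitted density-ratio model at points t."""
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return (gaussian_kernel(t, x_q, bandwidth) @ theta_x
                + gaussian_kernel(t, y_p, bandwidth) @ theta_y)

    return ratio, theta_x, theta_y
```

Other Bregman generators generally have no closed-form minimizer and would need an iterative solver; the quadratic case is shown here only because its first-order condition is linear.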

Paper Structure (21 sections, 12 theorems, 90 equations, 2 figures)

Key Result

Lemma 1

Let the Bregman generator $F$ and the density ratio model $g(f)$ be defined by [...] for some link function $\Psi$ and the Bayes risk $G$ of a loss $\ell$ satisfying the paper's assumption on strictly proper composite losses with differentiable Bayes risk (label ass:strictly_proper_composit_and_diff_bayes_risk). Then [...]

Figures (2)

  • Figure 1: The goal is to learn the density ratio $\beta=\frac{\mathop{}\!\mathrm{d} P}{\mathop{}\!\mathrm{d} Q}$ (black, solid).
  • Figure 2: Results for approximating the density ratio $\beta$ (black, dashed) by the KuLSIF (green, solid) and Exp method (blue, solid). Rows: sample sizes $m=n\in\{3,10,100\}$. Columns: Parameter choices $\lambda=10^i, i\in\{-3,\ldots,2\}$. Choices of the proposed Lepskii type principle are marked by boxes in the upper left.
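The parameter choice marked in Figure 2 selects one $\lambda$ from a finite grid. A generic Lepskii-type balancing rule can be sketched as follows; this is an illustration under stated assumptions, not the paper's exact principle. The noise bound `noise_bound`, the constant `const=4.0`, and the use of an empirical root-mean-square norm on a common set of evaluation points are all hypothetical choices, and the function name is invented for this example.

```python
import numpy as np


def lepskii_select(lams, estimates, noise_bound, const=4.0):
    """Generic Lepskii-type balancing rule (illustrative sketch).

    lams:        candidate regularization parameters.
    estimates:   estimates[i] holds the values of the estimator for lams[i]
                 on a common set of evaluation points.
    noise_bound: callable giving an assumed bound on the sample error for a
                 given lam (typically decreasing in lam).
    const:       assumed balancing constant.

    Returns the index of the largest lam whose estimate differs from every
    estimate with a smaller lam by at most const * noise_bound(smaller lam),
    measured in the empirical root-mean-square norm.
    """
    lams = np.asarray(lams, dtype=float)
    order = np.argsort(lams)
    chosen = order[0]  # the smallest lam is always admissible
    for pos in range(1, len(order)):
        i = order[pos]
        admissible = all(
            np.sqrt(np.mean((np.asarray(estimates[i]) - np.asarray(estimates[j])) ** 2))
            <= const * noise_bound(lams[j])
            for j in order[:pos]
        )
        if admissible:
            chosen = i
    return int(chosen)
```

The rule needs no knowledge of the regularity of the target: it only compares estimates with one another, and accepts a larger $\lambda$ as long as the change it causes is explained by the assumed sample-error level of the less regularized estimates.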

Theorems & Definitions (18)

  • Example 1
  • Lemma 1: menon2016linking, Proposition 3
  • Example 2
  • Lemma 2: marteau2019beyond, Theorem 3
  • Lemma 3: marteau2019beyond, Theorem 38
  • Example 3
  • Proposition 1
  • Remark 4
  • Proposition 2
  • Remark 5
  • ...and 8 more