Adaptive learning of density ratios in RKHS

Werner Zellinger; Stefan Kindermann; Sergei V. Pereverzyev

TL;DR

The paper addresses estimating the density ratio $\beta=\mathrm dP/\mathrm dQ$ from finite samples by learning a density-ratio model in an RKHS through a regularized Bregman-divergence objective. It establishes that the associated losses are generalized self-concordant, derives finite-sample error bounds that depend on a regularity parameter $r$ and a capacity parameter $\alpha$, and proves minimax-optimal rates for the square loss. A novel Lepski-type principle is introduced to adaptively select the regularization parameter $\lambda$ without knowing $r$, achieving the optimal rate in the square-loss case and near-optimal rates in general. The framework is validated by a numerical example and supported by a suite of detailed proofs for both a priori and empirical norm-based parameter choices, offering a practical, theory-backed method for density-ratio estimation in RKHSs.

Abstract

Estimating the ratio of two probability densities from finitely many observations of the densities is a central problem in machine learning and statistics with applications in two-sample testing, divergence estimation, generative modeling, covariate shift adaptation, conditional density estimation, and novelty detection. In this work, we analyze a large class of density ratio estimation methods that minimize a regularized Bregman divergence between the true density ratio and a model in a reproducing kernel Hilbert space (RKHS). We derive new finite-sample error bounds, and we propose a Lepskii type parameter choice principle that minimizes the bounds without knowledge of the regularity of the density ratio. In the special case of quadratic loss, our method adaptively achieves a minimax optimal error rate. A numerical illustration is provided.
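For the quadratic-loss special case mentioned in the abstract, the regularized objective has a closed-form minimizer (the KuLSIF estimator). The following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel and the objective $\frac{1}{2n}\sum_i f(x^Q_i)^2-\frac{1}{m}\sum_j f(x^P_j)+\frac{\lambda}{2}\|f\|^2$, whose scaling of $\lambda$ may differ from the paper's. The names `gaussian_kernel`, `kulsif_fit`, and `kulsif_predict` are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kulsif_fit(xp, xq, lam, bandwidth=1.0):
    """Coefficients of the square-loss estimate of dP/dQ.

    xp: samples from P (numerator), shape (m, d)
    xq: samples from Q (denominator), shape (n, d)
    The estimate is f = sum_i alpha_i k(xq_i, .) + sum_j beta_j k(xp_j, .).
    """
    m, n = len(xp), len(xq)
    k_qq = gaussian_kernel(xq, xq, bandwidth)
    k_qp = gaussian_kernel(xq, xp, bandwidth)
    # Setting the RKHS gradient of the objective to zero gives
    # beta_j = 1 / (m * lam) and (K_qq + n * lam * I) alpha = -K_qp beta.
    beta = np.full(m, 1.0 / (m * lam))
    alpha = np.linalg.solve(k_qq + n * lam * np.eye(n), -k_qp @ beta)
    return alpha, beta


def kulsif_predict(x, xp, xq, alpha, beta, bandwidth=1.0):
    """Evaluate the fitted density-ratio estimate at the rows of x."""
    return (gaussian_kernel(x, xq, bandwidth) @ alpha
            + gaussian_kernel(x, xp, bandwidth) @ beta)


# Toy data (an assumption for illustration): P = N(0, 1), Q = N(0.5, 1.2^2).
rng = np.random.default_rng(0)
xp = rng.normal(0.0, 1.0, size=(200, 1))
xq = rng.normal(0.5, 1.2, size=(200, 1))
alpha, beta = kulsif_fit(xp, xq, lam=0.1)
grid = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
ratio_on_grid = kulsif_predict(grid, xp, xq, alpha, beta)
```

Because the square loss does not enforce positivity, the values in `ratio_on_grid` may be negative for small samples or small $\lambda$.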

Paper Structure (21 sections, 12 theorems, 90 equations, 2 figures)

Introduction
Problem
Results
Related Work
Notation and Auxiliaries
Estimation of Bregman Divergence
Learning in RKHS with Convex Losses
Error Rates under Regularity and Capacity
Slow Error Rate with A Priori Parameter Choice
Fast Error Rate Requiring A Posteriori Parameter Choice
Balancing Principle using Approximation by Norm
Parameter Choice when the Norm is Known
Parameter Choice with Estimated Norm
Numerical Example
Verification of Details
...and 6 more sections

Key Result

Lemma 1

Let the Bregman generator $F$ and the density ratio model $g(f)$ be defined by [...] for some link function $\Psi$ and the Bayes risk $G$ of a loss $\ell$ satisfying Assumption ass:strictly_proper_composit_and_diff_bayes_risk. Then [...]

Figures (2)

Figure 1: The goal is to learn the density ratio $\beta=\frac{\mathrm{d}P}{\mathrm{d}Q}$ (black, solid).
Figure 2: Results for approximating the density ratio $\beta$ (black, dashed) by the KuLSIF (green, solid) and Exp method (blue, solid). Rows: sample sizes $m=n\in\{3,10,100\}$. Columns: Parameter choices $\lambda=10^i, i\in\{-3,\ldots,2\}$. Choices of the proposed Lepskii type principle are marked by boxes in the upper left.
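
The parameter grid in Figure 2 can be paired with a generic Lepskii-type balancing rule: among estimates ordered by increasing $\lambda$, keep the largest $\lambda_i$ whose estimate stays within a multiple of the sample-error bound of every estimate at a smaller $\lambda_j$. The sketch below illustrates that generic rule under assumptions and is not the paper's exact principle. The function name `lepskii_choice`, the Euclidean norm on coefficient vectors, the constant 4, and the toy estimates and bounds are hypothetical stand-ins for the norm and bounds derived in the paper.

```python
import numpy as np


def lepskii_choice(estimates, noise_bounds, constant=4.0):
    """Index chosen by a generic Lepskii-type balancing rule.

    estimates:    vectors representing the estimate at each parameter,
                  ordered by increasing regularization parameter
    noise_bounds: bound on the sample error at each parameter
                  (assumed to decrease as the parameter grows)
    Returns the largest index i such that, for every j < i,
    ||estimates[i] - estimates[j]|| <= constant * noise_bounds[j].
    """
    chosen = 0  # the smallest parameter satisfies the condition trivially
    for i in range(1, len(estimates)):
        within_bounds = all(
            np.linalg.norm(estimates[i] - estimates[j])
            <= constant * noise_bounds[j]
            for j in range(i)
        )
        if within_bounds:
            chosen = i
    return chosen


# Toy illustration on the grid lambda = 10^i, i in {-3, ..., 2}.
lambdas = 10.0 ** np.arange(-3, 3)
# Assumed behavior: the bias grows with lambda, the error bound shrinks.
toy_estimates = [np.array([1.0 + lam]) for lam in lambdas]
toy_bounds = 0.01 / np.sqrt(lambdas)
chosen_index = lepskii_choice(toy_estimates, toy_bounds)
chosen_lambda = lambdas[chosen_index]
```

The rule only compares estimates with each other and with the error bound, so it needs no knowledge of the regularity of the target; in this toy setup it stops growing $\lambda$ once the bias-driven gap between estimates exceeds the bound.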

Theorems & Definitions (18)

Example 1
Lemma 1: menon2016linking, Proposition 3
Example 2
Lemma 2: marteau2019beyond, Theorem 3
Lemma 3: marteau2019beyond, Theorem 38
Example 3
Proposition 1
Remark 4
Proposition 2
Remark 5
...and 8 more
