Ratio Covers of Convex Sets and Optimal Mixture Density Estimation

Spencer Compton, Gábor Lugosi, Jaouad Mourtada, Jian Qian, Nikita Zhivotovskiy

TL;DR

An optimal ratio covering theorem for convex sets is proved, which yields new cardinality estimates for $\varepsilon$-approximate Pareto sets in multi-objective optimization when the attainable set of objective vectors is convex.

Abstract

We study density estimation in Kullback-Leibler divergence: given an i.i.d. sample from an unknown density $p$, the goal is to construct an estimator $\widehat p$ such that $\mathrm{KL}(p,\widehat p)$ is small with high probability. We consider two settings involving a finite dictionary of $M$ densities: (i) model aggregation, where $p$ belongs to the dictionary, and (ii) convex aggregation (mixture density estimation), where $p$ is a mixture of densities from the dictionary. Crucially, we make no assumption on the base densities: their ratios may be unbounded and their supports may differ. For both problems, we identify the best possible high-probability guarantees in terms of the dictionary size, sample size, and confidence level. These optimal rates are higher than those achievable when density ratios are bounded by absolute constants; for mixture density estimation, they match existing lower bounds in the special case of discrete distributions. Our analysis of the mixture case hinges on two new covering results. First, we provide a sharp, distribution-free upper bound on the local Hellinger entropy of the class of mixtures of $M$ distributions. Second, we prove an optimal ratio covering theorem for convex sets: for every convex compact set $K\subset \mathbb{R}_+^d$, there exists a subset $A\subset K$ with at most $2^{8d}$ elements such that each element of $K$ is coordinate-wise dominated by an element of $A$ up to a universal constant factor. This geometric result is of independent interest; notably, it yields new cardinality estimates for $\varepsilon$-approximate Pareto sets in multi-objective optimization when the attainable set of objective vectors is convex.

Paper Structure

This paper contains 40 sections, 30 theorems, 211 equations, 1 table.

Key Result

Theorem 1

For every $d \geqslant 1$ and every convex and compact set $K \subset \mathbb R_+^d$, there exists a subset $A \subset K$ with at most $2^{8d}$ elements that is a $32$-ratio cover of $K$.
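
To make the statement concrete, here is a minimal sketch of what checking a ratio cover could look like. It assumes the reading suggested by the abstract: $A$ is a $\gamma$-ratio cover of $K$ if every $x \in K$ satisfies $x_i \leqslant \gamma a_i$ for all coordinates $i$, for some $a \in A$. Definition 1 is not reproduced on this page, so this reading is an assumption. The names `dominated`, `is_ratio_cover`, and `K_sample` are hypothetical, and the check runs only over a finite sample of $K$, so it can refute a cover but cannot certify one for the whole set.

```python
from typing import Iterable, Sequence

Point = Sequence[float]


def dominated(x: Point, a: Point, gamma: float) -> bool:
    """Return True if x <= gamma * a in every coordinate."""
    return all(xi <= gamma * ai for xi, ai in zip(x, a))


def is_ratio_cover(A: Iterable[Point], K_sample: Iterable[Point], gamma: float) -> bool:
    """Check the (assumed) gamma-ratio cover condition on a finite sample of K."""
    A = list(A)
    return all(any(dominated(x, a, gamma) for a in A) for x in K_sample)


# Toy example in R_+^2: K is the segment from (0, 1) to (1, 0),
# sampled at 11 evenly spaced points.
K_sample = [(t / 10, 1 - t / 10) for t in range(11)]

# A single point of K, the midpoint, as a candidate cover.
A = [(0.5, 0.5)]

print(is_ratio_cover(A, K_sample, gamma=2.0))
print(is_ratio_cover(A, K_sample, gamma=1.5))
```

In this toy case the midpoint alone passes the check with $\gamma = 2$ and fails with $\gamma = 1.5$, because the endpoint $(1, 0)$ has first coordinate $1 > 1.5 \cdot 0.5$. Theorem 1 guarantees that a constant factor ($32$) suffices in every dimension, with at most $2^{8d}$ cover points.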

Theorems & Definitions (49)

  • Definition 1
  • Theorem 1
  • Lemma 1
  • Lemma 2: Model aggregation with bounded density ratios
  • Theorem 2
  • Lemma 3
  • proof
  • Theorem 3
  • proof
  • Definition 2
  • ...and 39 more