Product distribution learning with imperfect advice

Arnab Bhattacharyya, Davin Choo, Philips George John, Themis Gouleakis

TL;DR

The paper addresses learning product distributions on $\{0,1\}^d$ when the learner is given imperfect advice. By combining a tolerant mean tester, a block-based analysis, and a constrained mean-estimation (LASSO) step, the authors design a polynomial-time algorithm that achieves $\mathrm{d_{TV}}(P,\widehat{P})\le \varepsilon$ using a number of samples sublinear in $d$, provided $\|\mathbf{p}-\mathbf{q}\|_1$ is sufficiently small and the distribution is $\tau$-balanced. The main result gives a sample bound of $\tilde{O}\bigl( \frac{d}{\varepsilon^2} \bigl( d^{-\eta} + \min\{1, \frac{\|\mathbf{p}-\mathbf{q}\|_1^2}{d^{1-4\eta}\varepsilon^2}\} \bigr) \bigr)$: fewer samples suffice when the advice is accurate, and the bound degrades to within a constant factor of the advice-free rate $d/\varepsilon^2$ (up to logarithmic factors) when the advice is poor. The work also establishes lower bounds showing that the balancedness assumption is necessary and that sublinear-sample learning is impossible when the advice is inadequate or the distribution is unbalanced. Overall, this work advances learning with predictions in the discrete, high-dimensional setting and suggests avenues for extending the framework to other complex models.
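To get a feel for how the stated bound interpolates between the two regimes, the sketch below evaluates the expression inside $\tilde{O}(\cdot)$ numerically. It drops the hidden constants and polylogarithmic factors, so the outputs indicate scaling only and are not actual sample counts; the function name and parameter names are illustrative and do not come from the paper.

```python
def advice_sample_bound(d, eps, l1_gap, eta):
    """Evaluate (d / eps^2) * (d^-eta + min(1, l1_gap^2 / (d^(1 - 4*eta) * eps^2))).

    d       : dimension of the hypercube {0,1}^d
    eps     : target total variation distance
    l1_gap  : ||p - q||_1, the l1 distance between true and advice mean vectors
    eta     : the exponent parameter appearing in the bound
    Constants and polylog factors hidden by the O-tilde are ignored (assumption).
    """
    baseline = d / eps**2  # advice-free rate d / eps^2
    advice_term = min(1.0, l1_gap**2 / (d ** (1 - 4 * eta) * eps**2))
    return baseline * (d ** (-eta) + advice_term)


if __name__ == "__main__":
    d, eps, eta = 10_000, 0.1, 0.25
    for gap in (0.0, 0.05, 100.0):
        print(gap, advice_sample_bound(d, eps, gap, eta))
```

With perfect advice (`l1_gap = 0`) the expression reduces to $d^{1-\eta}/\varepsilon^2$; once the `min` saturates at 1 it is at most twice $d/\varepsilon^2$.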

Abstract

Given i.i.d. samples from an unknown distribution $P$, the goal of distribution learning is to recover the parameters of a distribution that is close to $P$. When $P$ belongs to the class of product distributions on the Boolean hypercube $\{0,1\}^d$, it is known that $\Omega(d/\varepsilon^2)$ samples are necessary to learn $P$ within total variation (TV) distance $\varepsilon$. We revisit this problem when the learner is also given as advice the parameters of a product distribution $Q$. We show that there is an efficient algorithm to learn $P$ within TV distance $\varepsilon$ that has sample complexity $\tilde{O}(d^{1-\eta}/\varepsilon^2)$, if $\|\mathbf{p} - \mathbf{q}\|_1 < \varepsilon d^{0.5 - \Omega(\eta)}$. Here, $\mathbf{p}$ and $\mathbf{q}$ are the mean vectors of $P$ and $Q$ respectively, and no bound on $\|\mathbf{p} - \mathbf{q}\|_1$ is known to the algorithm a priori.
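As a concrete picture of the setting, the following sketch draws samples from a product distribution on $\{0,1\}^d$ with mean vector $\mathbf{p}$, forms the plain empirical mean (the advice-free baseline, not the algorithm of the paper), and measures the $\ell_1$ gap to an advice vector $\mathbf{q}$. All function names and the example vectors are hypothetical and chosen only for illustration.

```python
import random


def sample_product(p, n, rng):
    """Draw n i.i.d. samples from the product distribution with mean vector p.

    Each coordinate i is an independent Bernoulli(p[i]) bit.
    """
    return [[1 if rng.random() < pi else 0 for pi in p] for _ in range(n)]


def empirical_mean(samples):
    """Coordinate-wise average of the samples (advice-free baseline estimator)."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(d)]


def l1_distance(u, v):
    """l1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for reproducibility
    p = [0.2, 0.5, 0.8]             # true mean vector (illustrative)
    q = [0.25, 0.5, 0.75]           # advice mean vector (illustrative)
    p_hat = empirical_mean(sample_product(p, 20_000, rng))
    print("l1 error of empirical mean:", l1_distance(p, p_hat))
    print("l1 gap of the advice:", l1_distance(p, q))
```

In the paper's setting the learner never sees $\|\mathbf{p} - \mathbf{q}\|_1$ directly; the second printed quantity is available here only because the simulation knows $\mathbf{p}$.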

Paper Structure

This paper contains 9 sections, 9 theorems, 16 equations, 2 algorithms.

Key Result

Proposition 2.3

Suppose $P$ and $Q$ are $\tau$-balanced product distributions on $\{0,1\}^d$ with mean vectors $\mathbf{p}$ and $\mathbf{q}$ respectively. Then their KL divergence $\mathrm{d_{KL}}(P\|Q)$ satisfies:
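The quantity being bounded, $\mathrm{d_{KL}}(P\|Q)$, can be computed exactly for product distributions, because KL divergence is additive over independent coordinates: $\mathrm{d_{KL}}(P\|Q) = \sum_{i=1}^d \bigl[ p_i \ln\frac{p_i}{q_i} + (1-p_i)\ln\frac{1-p_i}{1-q_i} \bigr]$. The sketch below implements this standard identity only, not the proposition's bound; the function names are illustrative, and it assumes every $q_i$ lies strictly between 0 and 1.

```python
import math


def bernoulli_kl(a, b):
    """KL divergence (in nats) from Bernoulli(a) to Bernoulli(b).

    Assumes 0 < b < 1; terms with zero probability mass under a contribute 0.
    """
    total = 0.0
    if a > 0:
        total += a * math.log(a / b)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - b))
    return total


def product_kl(p, q):
    """KL divergence between product distributions with mean vectors p and q.

    Uses additivity of KL divergence over independent coordinates.
    """
    return sum(bernoulli_kl(a, b) for a, b in zip(p, q))


if __name__ == "__main__":
    p = [0.3, 0.6, 0.5]
    q = [0.5, 0.5, 0.5]
    print("d_KL(P || Q) =", product_kl(p, q))
```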

Theorems & Definitions (18)

  • Definition 2.1: Mean vectors
  • Definition 2.2
  • Proposition 2.3: e.g., canonne2017testing, Lemma 1
  • Proposition 2.4
  • proof
  • Theorem 3.1
  • Lemma 3.2: Tolerant mean tester
  • proof
  • Lemma 3.2
  • proof
  • ...and 8 more