The Sample Complexity of Uniform Approximation for Multi-Dimensional CDFs and Fixed-Price Mechanisms

Matteo Castiglioni; Anna Lunghi; Alberto Marchesi

TL;DR

This work addresses learning a uniform $\epsilon$-approximation to an unknown $n$-dimensional CDF on $[0,1]^n$ under one-bit bandit feedback. The authors introduce a grid-based approach that achieves a nearly dimension-invariant sample complexity of $\frac{1}{\epsilon^3}\log(1/\epsilon)^{\mathcal{O}(n)}$ queries, with the dimension appearing only in polylogarithmic factors through a carefully constructed representative set of hyperrectangles. Key techniques include Monte Carlo estimation of hyperrectangle probabilities, adaptive binary subdivision, and a logarithmically sized representative family of intervals, which together enable efficient reconstruction of the CDF on a fine grid. The results yield nearly tight bounds for learning fixed-price mechanisms in small markets and translate into regret guarantees for online settings; tightening the lower bounds and the regret rates is left open. Overall, the paper demonstrates that multidimensional CDF learning under minimal feedback can achieve one-dimensional-like sample complexity when restricted to grid-based estimation, leveraging the CDF’s inherent sparsity to overcome the curse of dimensionality in this context.
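As a rough illustration of the Monte Carlo step named in the summary, the sketch below estimates the CDF at a single query point from one-bit answers. It assumes that a query at $x$ returns only the indicator that a fresh sample $X$ satisfies $X \le x$ coordinate-wise; this feedback model, the helper names (`make_one_bit_oracle`, `hoeffding_sample_size`, `estimate_cdf_at`), and the Hoeffding-based sample count are assumptions made for illustration, not the paper's algorithm.

```python
import math
import random


def make_one_bit_oracle(sample_point, rng):
    """Build a query function under the assumed feedback model.

    Each call draws a fresh sample X and reveals a single bit:
    1 if X <= x in every coordinate, 0 otherwise.
    """
    def query(x):
        sample = sample_point(rng)
        return int(all(s <= xi for s, xi in zip(sample, x)))
    return query


def hoeffding_sample_size(eps, delta):
    """Queries sufficient for a two-sided Hoeffding bound at one point:
    m >= ln(2 / delta) / (2 * eps^2)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def estimate_cdf_at(query, x, eps, delta):
    """Average one-bit answers to estimate P(X <= x) at a single point."""
    m = hoeffding_sample_size(eps, delta)
    hits = sum(query(x) for _ in range(m))
    return hits / m


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical test distribution: uniform on [0, 1]^2.
    uniform_2d = lambda r: (r.random(), r.random())
    query = make_one_bit_oracle(uniform_2d, rng)
    # The true value at (0.5, 0.8) is 0.5 * 0.8 = 0.4.
    print(estimate_cdf_at(query, (0.5, 0.8), eps=0.05, delta=0.01))
```

Repeating this independently at every point of a grid with roughly $1/\epsilon$ points per axis would cost on the order of $\epsilon^{-(n+2)}$ queries, which is the kind of exponential dependence on $n$ that the representative set of hyperrectangles is meant to avoid.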

Abstract

We study the sample complexity of learning a uniform approximation of an $n$-dimensional cumulative distribution function (CDF) within an error $\epsilon > 0$, when observations are restricted to a minimal one-bit feedback. This serves as a counterpart to the multivariate DKW inequality under "full feedback", extending it to the setting of "bandit feedback". Our main result shows a near-dimensional invariance in the sample complexity: we get a uniform $\epsilon$-approximation with a sample complexity $\frac{1}{\epsilon^3}\log\left(\frac{1}{\epsilon}\right)^{\mathcal{O}(n)}$ over an arbitrarily fine grid, where the dimensionality $n$ only affects logarithmic terms. As direct corollaries, we provide tight sample complexity bounds and novel regret guarantees for learning fixed-price mechanisms in small markets, such as bilateral trade settings.
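For context on the full-feedback benchmark, the classical one-dimensional DKW inequality (with Massart's constant) can be written as follows. Here $F_m$ denotes the empirical CDF of $m$ i.i.d. samples observed in full; the symbols $F_m$ and $m$ are notation introduced for this note only, and the multivariate version referenced in the abstract is not reproduced here.

```latex
\[
  \mathbb{P}\left( \sup_{x \in \mathbb{R}} \left| F_m(x) - F(x) \right| > \epsilon \right)
  \le 2 e^{-2 m \epsilon^2},
  \qquad\text{so}\qquad
  m \ge \frac{\ln(2/\delta)}{2\epsilon^2}
  \;\Longrightarrow\;
  \sup_{x} \left| F_m(x) - F(x) \right| \le \epsilon
  \ \text{with probability at least } 1-\delta .
\]
```

Under full feedback in one dimension, the dependence on the accuracy is therefore of order $\epsilon^{-2}$, whereas the bandit-feedback bound stated in the abstract scales as $\epsilon^{-3}$ up to logarithmic factors.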

Paper Structure (48 sections, 15 theorems, 70 equations, 4 figures, 5 algorithms)

Introduction
Relation with the Multivariate DKW Inequality
Overview of the Results
The Sample Complexity of Multi-Dimensional CDFs Over a Grid
The Sample Complexity of Fixed-Price Mechanisms
Regret Minimization for Fixed-Price Mechanisms
Challenges and Techniques
Our Approach
Estimating the Probability of a Hyperrectangle
Binary Subdivision
Building a Representative Set of Intervals
Representative Hyperrectangles Identification
Estimating the Cumulative Probability
Preliminaries
Learning Multi-Dimensional CDFs
...and 33 more sections

Key Result

Theorem 1

Assume that the distribution $\mathcal{D}$ admits a probability density function upper bounded by $\sigma > 0$. Then, there exists an algorithm that, given an accuracy $\epsilon > 0$ and a confidence $\delta \in (0,1)$, uses $\frac{1}{\epsilon^3}\log \left( 1/(\sigma\epsilon\delta) \right)^{\mathcal

Figures (4)

Figure 1: Illustration of the uncertainty region associated with $\mathbb{P}_{X \sim \mathcal{D}}(X\le x)$.
Figure 2: Visual representation of how different representative hyperrectangles are used to compose an estimate in two dimensions ($n=2$). The left side of the picture shows the partitions that define the representative family of intervals along the horizontal dimension. For each of these partitions, a further partitioning along the second dimension is constructed. The right side of the picture illustrates the hyperrectangle composition used to estimate the CDF at the red point.
Figure 3: On the left, the hierarchical construction of the representative family of intervals $\mathcal{I}^\star$, built from the partition $\mathcal{I}=\{I_i\}_{i=1}^8$. The base $\mathcal{I}_0$ (orange) is the original partition. Higher levels $\mathcal{I}_1$ (yellow), $\mathcal{I}_2$ (blue), and $\mathcal{I}_3$ (violet) merge consecutive intervals, forming the full family $\mathcal{I}^\star$ (grey outline). On the right, the binary representation of the intervals in the family. Specifically, each $\mathcal{I}_\ell$ groups together intervals whose binary encoding coincides in their $n-\ell$ most significant bits.
Figure 4: Graphical representation of the representative family $\mathcal{I}^\star$ and its use in expressing a generic interval $[0,x]$. In this example, $[0,x] = \bigcup_{k=1}^{13} I_k$. Since $13$ has binary representation $1101$, the union of the first $13$ intervals can be expressed as the union of one interval from each $\mathcal{I}_\ell$ corresponding to a $1$ in the $(\ell{+}1)$-th least significant bit. In this case, the ones occur in positions $1$, $3$, and $4$, meaning that the representation within $\mathcal{I}^\star$ includes one element from each of $\mathcal{I}_0$, $\mathcal{I}_2$, and $\mathcal{I}_3$.
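The binary decomposition described in the captions of Figures 3 and 4 can be sketched in a few lines. The code below is a simplified illustration, not the paper's procedure: it assumes a fixed family built over $2^L$ base intervals indexed from $0$, whereas the captions suggest the paper builds its partitions adaptively, with a separate partition of the second dimension for each interval of the first. The function names and the `rect_prob` callback are hypothetical.

```python
from itertools import product


def build_dyadic_family(num_levels):
    """Level l holds the blocks that merge 2**l consecutive base intervals.

    Blocks are half-open index ranges (start, end) over 2**num_levels
    base intervals, numbered from 0.
    """
    m = 1 << num_levels
    return {
        level: [(s, s + (1 << level)) for s in range(0, m, 1 << level)]
        for level in range(num_levels + 1)
    }


def decompose_prefix(k, num_levels):
    """Cover the first k base intervals with one block per 1-bit of k.

    Returns (level, start, end) triples, largest block first.
    """
    if not 0 <= k <= (1 << num_levels):
        raise ValueError("k must lie between 0 and 2**num_levels")
    blocks, start = [], 0
    for level in range(num_levels, -1, -1):
        size = 1 << level
        if k & size:
            blocks.append((level, start, start + size))
            start += size
    return blocks


def decompose_box(ks, num_levels):
    """Cover the box [0, k_1) x ... x [0, k_n) with products of blocks."""
    per_dim = [decompose_prefix(k, num_levels) for k in ks]
    return list(product(*per_dim))


def cdf_from_blocks(ks, num_levels, rect_prob):
    """Sum the probabilities of the disjoint hyperrectangles covering the box.

    rect_prob is a caller-supplied function, for example a Monte Carlo
    estimate of each hyperrectangle's probability.
    """
    return sum(rect_prob(rect) for rect in decompose_box(ks, num_levels))


if __name__ == "__main__":
    # 13 = 0b1101: one block from each of levels 3, 2 and 0, as in Figure 4.
    print(decompose_prefix(13, 4))
    # Eight base intervals give levels 0..3 with 8 + 4 + 2 + 1 = 15 blocks.
    print(sum(len(v) for v in build_dyadic_family(3).values()))
```

With this fixed family, any prefix needs at most $L+1$ blocks per dimension, so a CDF value in $n$ dimensions is a sum over at most $(L+1)^n$ hyperrectangles, which is one way to see where a logarithmic factor raised to a power of order $n$ can come from.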

Theorems & Definitions (26)

Theorem (theo: mainSmooth)
Theorem (theo: main)
Theorem (thm:pricing)
Theorem (theo: mainFixedPriceRegret)
Remark 2.1
Theorem 3.1
Corollary 3.2
proof
Remark 4.1
Lemma 4.1
...and 16 more
