Nearly Optimal Bounds for Sample-Based Testing and Learning of $k$-Monotone Functions

Hadley Black

TL;DR

The paper settles the sample complexity of sample-based monotonicity testing and learning, establishing nearly tight exponential bounds under the uniform distribution on the hypercube and extending them to measurable $k$-monotone functions under product measures on $\mathbb{R}^d$. Using a lower-bound construction based on Talagrand's random DNFs, it shows that testing $k$-monotonicity of functions $f \colon \{0,1\}^d \to [r]$ requires $\exp\big(\Omega(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\})\big)$ samples. It also gives upper bounds for learning and testing that match up to polylog factors in the exponent, together with a separate $\exp(\Theta(d))$ bound for testing with one-sided error. For continuous product spaces, it obtains nearly tight bounds of $\exp(\widetilde{\Theta}(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}))$, improving the dependence on $\varepsilon$ in the exponent from $1/\varepsilon^2$ to $1/\varepsilon$ relative to the prior upper bound for Boolean functions ($r=2$). Central techniques include a downsampling reduction to hypergrids, Fourier concentration and the Low-Degree algorithm for learning on grids, and a layered DNF construction for the lower bounds. The results nearly close the gap between the upper and lower bounds on the sample complexity of these monotonicity classes and provide a foundation for further work on testing and learning $k$-monotone functions in high dimensions.

Abstract

We study monotonicity testing of functions $f \colon \{0,1\}^d \to \{0,1\}$ using sample-based algorithms, which are only allowed to observe the value of $f$ on points drawn independently from the uniform distribution. A classic result by Bshouty-Tamon (J. ACM 1996) proved that monotone functions can be learned with $\exp(\widetilde{O}(\min\{\frac{1}{\varepsilon}\sqrt{d},d\}))$ samples and it is not hard to show that this bound extends to testing. Prior to our work the only lower bound for this problem was $\Omega(\sqrt{\exp(d)/\varepsilon})$ in the small $\varepsilon$ parameter regime, when $\varepsilon = O(d^{-3/2})$, due to Goldreich-Goldwasser-Lehman-Ron-Samorodnitsky (Combinatorica 2000). Thus, the sample complexity of monotonicity testing was wide open for $\varepsilon \gg d^{-3/2}$. We resolve this question, obtaining a nearly tight lower bound of $\exp(\Omega(\min\{\frac{1}{\varepsilon}\sqrt{d},d\}))$ for all $\varepsilon$ at most a sufficiently small constant. In fact, we prove a much more general result, showing that the sample complexity of $k$-monotonicity testing and learning for functions $f \colon \{0,1\}^d \to [r]$ is $\exp(\Omega(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}))$. For testing with one-sided error we show that the sample complexity is $\exp(\Theta(d))$. Beyond the hypercube, we prove nearly tight bounds (up to polylog factors of $d,k,r,1/\varepsilon$ in the exponent) of $\exp(\widetilde{\Theta}(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}))$ on the sample complexity of testing and learning measurable $k$-monotone functions $f \colon \mathbb{R}^d \to [r]$ under product distributions. Our upper bound improves upon the previous bound of $\exp(\widetilde{O}(\min\{\frac{k}{\varepsilon^2}\sqrt{d},d\}))$ by Harms-Yoshida (ICALP 2022) for Boolean functions ($r=2$).
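
To make the sample-based access model concrete, here is a minimal Python sketch of a one-sided tester for the simplest case (Boolean range, ordinary monotonicity). It is an illustration only, not the paper's algorithm: the function names `leq` and `one_sided_sample_tester` and the parameter `num_samples` are hypothetical. The tester sees $f$ only on uniformly random points and rejects only when the sample itself contains a violated pair $x \le y$ with $f(x) > f(y)$, so it never rejects a monotone function.

```python
import itertools
import random


def leq(x, y):
    """Coordinate-wise partial order on {0,1}^d."""
    return all(a <= b for a, b in zip(x, y))


def one_sided_sample_tester(f, d, num_samples, rng):
    """Return True (accept) unless the labeled sample contains a violation.

    The tester only observes f on points drawn independently and uniformly
    from {0,1}^d; it never chooses its own query points.
    """
    points = [tuple(rng.randint(0, 1) for _ in range(d))
              for _ in range(num_samples)]
    labeled = [(x, f(x)) for x in points]
    # Check every ordered pair of samples for x <= y with f(x) > f(y).
    for (x, fx), (y, fy) in itertools.permutations(labeled, 2):
        if leq(x, y) and fx > fy:
            return False  # a witnessed violation: safe to reject
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    monotone_or = lambda x: int(any(x))   # monotone, always accepted
    anti_dictator = lambda x: 1 - x[0]    # far from monotone
    print(one_sided_sample_tester(monotone_or, 6, 50, rng))
    print(one_sided_sample_tester(anti_dictator, 3, 200, rng))
```

Because a uniformly random pair of points in $\{0,1\}^d$ is comparable only with exponentially small probability, a tester of this kind needs many samples before it can see any violation at all, which is consistent in spirit with the $\exp(\Theta(d))$ bound for one-sided error stated above.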

Paper Structure (32 sections, 19 theorems, 60 equations, 1 figure, 1 algorithm)

Key Result

Theorem 1.1

There is an absolute constant $c > 0$ such that for all $\varepsilon \leq c$, every sample-based $k$-monotonicity tester for functions $f\colon \{0,1\}^d \to [r]$ under the uniform distribution has sample complexity $\exp\big(\Omega(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\})\big)$.
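
As a reading aid for the exponent $\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}$ that appears in the abstract, the following short comparison (a worked illustration, not taken from the paper, and ignoring the constants hidden by $\Omega(\cdot)$) shows which term of the minimum is active:

$$\frac{rk}{\varepsilon}\sqrt{d} \;\le\; d \quad\Longleftrightarrow\quad \varepsilon \;\ge\; \frac{rk}{\sqrt{d}}.$$

So for $\varepsilon \ge rk/\sqrt{d}$ the lower bound reads $\exp\big(\Omega(\frac{rk}{\varepsilon}\sqrt{d})\big)$, while for smaller $\varepsilon$ it saturates at $\exp(\Omega(d))$. For example, with $r = 2$, $k = 1$ and $d = 10^4$, the crossover sits at $\varepsilon = 2/\sqrt{10^4} = 0.02$.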

Figures (1)

  • Figure 1: An illustration of the construction used in our proof of Theorem 1.1. The image represents the set of points in the hypercube $\{0,1\}^d$ with Hamming weight in the interval $[\frac{d}{2},\frac{d}{2}+\varepsilon\sqrt{d})$, increasing from bottom to top. The numbers on the left denote the Hamming weight of the points lying in the adjacent horizontal line. The $B_i$ blocks are the sets of points contained between two adjacent horizontal lines. Each orange shaded region within $B_i$ represents the set of points satisfied by a term $t^{i,j}$. The blue numbers represent the value that functions in the support of $\mathcal{D}_{\texttt{yes}}$ and $\mathcal{D}_{\texttt{no}}$ can take. We have used the notation "$r-1,2$" as shorthand for $r-2,r-1$.

Theorems & Definitions (52)

  • Theorem 1.1: Testing Lower Bound
  • Corollary 1.2: Learning Lower Bound
  • Theorem 1.3: Learning Upper Bound for Hypercubes
  • Corollary 1.4: Testing Upper Bound for Hypercubes
  • Theorem 1.5: Testing with One-Sided Error
  • Theorem 1.6: Learning Upper Bound for Product Spaces
  • Corollary 1.7: Testing Upper Bound for Product Spaces
  • Proof of Theorem 1.3
  • Definition 2.1
  • Definition 2.2: $k$-monotonicity
  • ...and 42 more
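
The text of Definition 2.2 is not reproduced in this listing. As a hedged illustration, the Python sketch below assumes the standard Boolean definition from the $k$-monotonicity literature: $f \colon \{0,1\}^d \to \{0,1\}$ is $k$-monotone if there is no chain $x_1 < x_2 < \dots < x_{k+1}$ with $f(x_1) = 1$ and $f(x_i) \ne f(x_{i+1})$ for all $i$. The paper's definition for range $[r]$ may be phrased differently. The name `is_k_monotone` is hypothetical, and the check is brute force, so it is only practical for small $d$.

```python
from itertools import product


def leq(x, y):
    """Coordinate-wise partial order on {0,1}^d."""
    return all(a <= b for a, b in zip(x, y))


def is_k_monotone(f, d, k):
    """Brute-force check of the assumed Boolean k-monotonicity definition.

    best[y] is the length of the longest alternating chain that starts at a
    point with f = 1 and ends at y (0 if no such chain exists).
    """
    # Sorting by Hamming weight gives a linear extension of the partial
    # order, so every x strictly below y is processed before y.
    points = sorted(product((0, 1), repeat=d), key=sum)
    best = {}
    for y in points:
        fy = f(y)
        length = 1 if fy == 1 else 0  # a chain may start only where f = 1
        for x, chain_len in best.items():
            if chain_len > 0 and leq(x, y) and f(x) != fy:
                length = max(length, chain_len + 1)
        best[y] = length
    # f is k-monotone iff no alternating chain has k + 1 points.
    return max(best.values()) <= k


if __name__ == "__main__":
    parity = lambda x: sum(x) % 2
    print(is_k_monotone(parity, 3, 2), is_k_monotone(parity, 3, 3))
```

Under this assumed definition, $1$-monotone coincides with ordinary monotonicity, and parity on $d$ bits is $d$-monotone but not $(d-1)$-monotone.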