The Phase Transition of Discrepancy in Random Hypergraphs
Calum MacRury, Tomáš Masařík, Leilani Pai, Xavier Pérez-Giménez
TL;DR
The paper analyzes discrepancy in two random hypergraph models, the edge-independent model $\mathbb{H}(n,m,p)$ and the edge-dependent model $\mathcal{H}(n,m,d)$, and proves lower bounds that reveal a phase transition as the edge count $m$ varies relative to the number of vertices $n$. Using probabilistic methods and Berry–Esseen-type bounds, it establishes w.h.p. lower bounds of $\mathrm{disc}(H)=\Omega\big(2^{-n/m}\sqrt{pn}\big)$ for $m=O(n)$ and $\Omega\big(\sqrt{pn\log\gamma}\big)$ for $m\gg n$ in the edge-independent model, where $\gamma=\min\{m/n,\,pn\}$, with parallel results for the edge-dependent model in which $pn$ is replaced by $\tfrac{dn}{m}$. In the dense regime $m\gg n$, the authors provide nearly matching, algorithmic upper bounds via Lovett–Meka's partial colouring lemma, showing $\mathrm{disc}(H)=O\big(\sqrt{\tfrac{dn}{m}}\log(m/n)\big)$ (and $\Theta\big(\sqrt{\tfrac{dn}{m}\log(m/n)}\big)$ in the optimized parameter range). Together with the work of Bansal and Meka, these results characterize how discrepancy transitions from $\Theta(\sqrt{d})$ to $o(\sqrt{d})$ as $m$ grows from $\Theta(n)$ to $m\gg n$, and they connect to the Beck–Fiala conjecture in a probabilistic setting. The work also discusses open problems in the sparse regime.
Abstract
Motivated by the Beck–Fiala conjecture, we study the discrepancy problem in two related models of random hypergraphs on $n$ vertices and $m$ edges. In the first (edge-independent) model, a random hypergraph $H_1$ is constructed by fixing a parameter $p$ and allowing each of the $n$ vertices to join each of the $m$ edges independently with probability $p$. In the parameter range in which $pn \rightarrow \infty$ and $pm \rightarrow \infty$, we show that with high probability (w.h.p.) $H_1$ has discrepancy at least $\Omega(2^{-n/m} \sqrt{pn})$ when $m = O(n)$, and at least $\Omega(\sqrt{pn \log\gamma})$ when $m \gg n$, where $\gamma = \min\{ m/n, pn\}$. In the second (edge-dependent) model, $d$ is fixed and each vertex of $H_2$ independently joins exactly $d$ edges uniformly at random. We obtain analogous results for this model by generalizing the techniques used for the edge-independent model with $p=d/m$. Namely, for $d \rightarrow \infty$ and $dn/m \rightarrow \infty$, we prove that w.h.p. $H_{2}$ has discrepancy at least $\Omega(2^{-n/m} \sqrt{dn/m})$ when $m = O(n)$, and at least $\Omega(\sqrt{(dn/m) \log\gamma})$ when $m \gg n$, where $\gamma = \min\{m/n, dn/m\}$. Furthermore, we obtain nearly matching asymptotic upper bounds on the discrepancy in both models (when $p=d/m$), in the dense regime of $m \gg n$. Specifically, we apply the partial colouring lemma of Lovett and Meka to show that w.h.p. $H_{1}$ and $H_{2}$ each have discrepancy $O( \sqrt{dn/m} \log(m/n))$, provided $d \rightarrow \infty$, $d n/m \rightarrow \infty$ and $m \gg n$. This result is algorithmic, and together with the work of Bansal and Meka characterizes how the discrepancy of each random hypergraph model transitions from $\Theta(\sqrt{d})$ to $o(\sqrt{d})$ as $m$ varies from $m=\Theta(n)$ to $m \gg n$.
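
To make the two models concrete, the following minimal Python sketch samples a hypergraph from each and computes its discrepancy by exhaustive search over all $\pm 1$ colourings. It is an illustration only, not the authors' method: the function names are hypothetical, the edge-dependent sampler assumes each vertex picks $d$ distinct edges, and the brute-force search is exponential in $n$, so it is usable only for very small instances, far from the asymptotic regime the bounds describe.

```python
import itertools
import random


def sample_edge_independent(n, m, p, rng):
    """Edge-independent model: each of the n vertices joins each of the
    m edges independently with probability p."""
    return [[v for v in range(n) if rng.random() < p] for _ in range(m)]


def sample_edge_dependent(n, m, d, rng):
    """Edge-dependent model: each vertex independently joins exactly d
    edges chosen uniformly at random (assumed distinct, so d <= m)."""
    edges = [[] for _ in range(m)]
    for v in range(n):
        for e in rng.sample(range(m), d):
            edges[e].append(v)
    return edges


def discrepancy(n, edges):
    """Brute-force discrepancy: minimise, over all colourings chi of the
    vertices by +1/-1, the largest absolute colour sum over any edge."""
    return min(
        max((abs(sum(chi[v] for v in e)) for e in edges), default=0)
        for chi in itertools.product((1, -1), repeat=n)
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    n, m, d = 10, 20, 4
    h1 = sample_edge_independent(n, m, d / m, rng)  # p = d/m, as in the abstract
    h2 = sample_edge_dependent(n, m, d, rng)
    print("edge-independent discrepancy:", discrepancy(n, h1))
    print("edge-dependent discrepancy:", discrepancy(n, h2))
```

Setting `p = d / m` in the first sampler mirrors the coupling between the two models used in the abstract: each vertex then has expected degree $d$ in the edge-independent model and degree exactly $d$ in the edge-dependent one.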
