On the Relationship Between Several Variants of the Linear Hashing Conjecture
Alek Westover
TL;DR
This paper investigates the worst-case maxload of Linear Hashing (LH) and several natural variants across prime, composite, and real moduli, focusing on the open question of whether the maxload is $n^{o(1)}$. It develops a framework of intermediate questions connecting the prime, integer, and real modulus models, and makes partial progress on them, including a transfer of Knudsen’s $\tilde{O}(n^{1/3})$ bound to composite moduli via Smart LH and a strong result for the two-bin case. A key contribution is an equivalence between Real LH and Random Modulus LH, which suggests that subpolynomial maxload behavior, if it holds, may be robust to the choice of modulus. The analysis relies on Farey sequences and a careful decomposition of collision structures. Together, these results provide evidence that composite-modulus LH may mirror prime-modulus behavior, and they identify concrete, tractable directions (such as the two-bin case) toward resolving the broader open problem on LH maxload.
Abstract
In Linear Hashing ($\mathsf{LH}$) with $β$ bins on a size-$u$ universe ${\mathcal{U}=\{0,1,\ldots, u-1\}}$, items $\{x_1,x_2,\ldots, x_n\}\subset \mathcal{U}$ are placed in bins by the hash function $$x_i\mapsto ((ax_i+b) \bmod p) \bmod β$$ for some prime $p\in [u,2u]$ and randomly chosen integers $a,b \in [1,p]$. The "maxload" of $\mathsf{LH}$ is the number of items assigned to the fullest bin. The expected maxload for a worst-case set of items is a natural measure of how well $\mathsf{LH}$ distributes items among the bins. Throughout, fix $β=n$. Despite $\mathsf{LH}$'s simplicity, bounding its worst-case maxload is extremely challenging. It is well known that on random inputs $\mathsf{LH}$ achieves maxload $Ω\left(\frac{\log n}{\log\log n}\right)$; this is currently the best lower bound for $\mathsf{LH}$'s worst-case expected maxload. Recently Knudsen established an upper bound of $\widetilde{O}(n^{1 / 3})$. The question "Is the worst-case expected maxload of $\mathsf{LH}$ $n^{o(1)}$?" is one of the most basic open problems in discrete math. In this paper we propose a set of intermediate open questions to help researchers make progress on this problem. We establish the relationships among these intermediate open questions and make some partial progress on them.
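To make the definitions concrete, the following is a minimal Python sketch of the hash function and of a Monte Carlo estimate of the expected maxload for a fixed item set. It is an illustration only, not code from the paper: the function names, the choice of the smallest prime in $[u,2u]$, the uniform sampling of $a$ and $b$, and the trial count are all assumptions made here.

```python
import random


def linear_hash(x, a, b, p, beta):
    """Bin of item x under x -> ((a*x + b) mod p) mod beta."""
    return ((a * x + b) % p) % beta


def maxload(items, a, b, p, beta):
    """Number of items in the fullest bin for fixed a, b."""
    counts = [0] * beta
    for x in items:
        counts[linear_hash(x, a, b, p, beta)] += 1
    return max(counts)


def is_prime(m):
    """Trial division; adequate for the small moduli used in experiments."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def prime_in_range(u):
    """Smallest prime in [u, 2u] (assumed choice; any prime there would do)."""
    for p in range(u, 2 * u + 1):
        if is_prime(p):
            return p
    raise ValueError("no prime found in [u, 2u]")


def estimate_expected_maxload(items, u, trials=1000, seed=0):
    """Average maxload over random a, b in [1, p], with beta = n bins.

    Assumes a and b are drawn uniformly and independently.
    """
    rng = random.Random(seed)
    p = prime_in_range(u)
    beta = len(items)  # the abstract fixes beta = n
    total = 0
    for _ in range(trials):
        a = rng.randint(1, p)  # randint is inclusive on both ends
        b = rng.randint(1, p)
        total += maxload(items, a, b, p, beta)
    return total / trials
```

For example, `estimate_expected_maxload(list(range(0, 1000, 10)), u=1000)` estimates the expected maxload for 100 evenly spaced items. Such an experiment only probes one particular input; the open question concerns the worst case over all item sets, which sampling cannot certify.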
