Explicit Orthogonal Arrays and Universal Hashing with Arbitrary Parameters
Nicholas Harvey, Arvin Sahami
TL;DR
The paper gives explicit, deterministic constructions of near-optimal orthogonal arrays for arbitrary parameters using algebraic geometry (AG) codes, with array sizes within a polynomial of Rao’s lower bound. Reed-Solomon instantiations give simpler, practical bounds, while AG codes yield near-optimal sizes across all parameter regimes with polynomial-time construction. The paper also uses the equivalence between orthogonal arrays and $t$-independent hash families to obtain hash families with efficient representation, evaluation, and construction, including improved construction times under the Generalized Riemann Hypothesis (GRH). The resulting hash families work over general alphabets, and randomized alternatives with near-optimal sizes are also given.
Abstract
Orthogonal arrays are a type of combinatorial design developed in the 1940s for the design of statistical experiments. In 1947, Rao proved a lower bound on the size of any orthogonal array and raised the problem of constructing arrays of minimum size. Kuperberg, Lovett and Peled (2017) gave a non-constructive existence proof of orthogonal arrays whose size is near-optimal (i.e., within a polynomial of Rao's lower bound), leaving open the question of an algorithmic construction. We give the first explicit, deterministic, algorithmic construction of orthogonal arrays achieving near-optimal size for all parameters. Our construction uses algebraic geometry codes. In pseudorandomness, the notions of $t$-independent generators and $t$-independent hash functions are equivalent to orthogonal arrays. Classical constructions of $t$-independent hash functions are known when the size of the codomain is a prime power, but very few constructions are known for an arbitrary codomain. Our construction yields algorithmically efficient $t$-independent hash functions for arbitrary domain and codomain.
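
To illustrate the equivalence between orthogonal arrays and $t$-independent hash functions, the following minimal sketch builds the classical polynomial (Reed-Solomon style) family over a prime field and checks the orthogonal-array property by brute force. It is not the algebraic geometry code construction of the paper; it covers only the well-known case where the alphabet size is a prime $p$. The function names `polynomial_orthogonal_array` and `has_strength` are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import combinations, product


def polynomial_orthogonal_array(p, t):
    """Return one row per polynomial of degree < t over Z_p (p assumed prime).

    Row entries are the evaluations of the polynomial at x = 0, 1, ..., p-1,
    so the array has p**t rows and p columns. Each row can be read as a hash
    function h: Z_p -> Z_p given by its table of values.
    """
    rows = []
    for coeffs in product(range(p), repeat=t):
        row = tuple(
            sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(p)
        )
        rows.append(row)
    return rows


def has_strength(rows, t, q):
    """Brute-force check that every choice of t columns shows each of the
    q**t possible t-tuples equally often across the rows."""
    k = len(rows[0])
    if len(rows) % q**t != 0:
        return False
    expected = len(rows) // q**t
    for cols in combinations(range(k), t):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        if len(counts) != q**t or any(v != expected for v in counts.values()):
            return False
    return True


# Strength-2 array over Z_5: 25 rows, 5 columns.
oa = polynomial_orthogonal_array(p=5, t=2)
assert has_strength(oa, t=2, q=5)
```

Choosing a row uniformly at random and reading it as a function gives a $t$-independent hash function: the values at any $t$ distinct inputs are uniform and independent, which is exactly the strength-$t$ condition checked above. The construction relies on arithmetic in a field, which is why it does not extend directly to alphabet sizes that are not prime powers.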
