Computing Low-Entropy Couplings for Large-Support Distributions
Samuel Sokota, Dylan Sam, Christian Schroeder de Witt, Spencer Compton, Jakob Foerster, J. Zico Kolter
TL;DR
This work tackles the minimum-entropy coupling (MEC) problem for marginals with large supports, where exact MEC is intractable. It unifies iterative MEC methods under a partition-based formalism and introduces ARIMEC, the first IMEC variant capable of handling arbitrary discrete distributions, by leveraging an autoregressive prefix-tree partition set. To address brittleness to partition choices, it adds merging, which groups posterior updates to reduce entropy waste and improve robustness. Empirical results in Markov coding games and steganography demonstrate that ARIMEC, especially with merging, achieves higher throughput and more reliable decoding than prior approaches, enabling practical low-entropy couplings for large-scale applications. The work also provides efficient implementation strategies and a codebase for broader adoption in high-throughput, large-support MEC tasks.
Abstract
Minimum-entropy coupling (MEC) -- the process of finding a joint distribution with minimum entropy for given marginals -- has applications in areas such as causality and steganography. However, existing algorithms are either computationally intractable for large-support distributions or limited to specific distribution types and sensitive to hyperparameter choices. This work addresses these limitations by unifying a prior family of iterative MEC (IMEC) approaches into a generalized partition-based formalism. From this framework, we derive a novel IMEC algorithm called ARIMEC, capable of handling arbitrary discrete distributions, and introduce a method to make IMEC robust to suboptimal hyperparameter settings. These innovations facilitate the application of IMEC to high-throughput steganography with language models, among other settings. Our codebase is available at https://github.com/ssokota/mec.
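To make the definition of a coupling concrete, the following is a minimal sketch of a simple greedy heuristic for small supports. It is not ARIMEC or any other algorithm from this work, and it is not taken from the codebase; the function names `greedy_coupling` and `entropy` are illustrative assumptions. The heuristic repeatedly pairs the largest remaining mass of one marginal with the largest remaining mass of the other, which tends to concentrate the joint distribution on few cells and so keeps its entropy low.

```python
import math


def greedy_coupling(p, q, tol=1e-12):
    """Greedy low-entropy coupling heuristic for two small marginals.

    Illustrative only: returns a dict mapping (i, j) to joint mass whose
    row sums match p and whose column sums match q (up to tol).
    """
    p, q = list(p), list(q)  # work on copies of the remaining masses
    joint = {}
    while True:
        # Pick the largest remaining mass in each marginal.
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        m = min(p[i], q[j])
        if m <= tol:  # all mass has been assigned
            break
        # Assign as much mass as possible to cell (i, j).
        joint[(i, j)] = joint.get((i, j), 0.0) + m
        p[i] -= m
        q[j] -= m
    return joint


def entropy(masses):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(m * math.log2(m) for m in masses if m > 0)


p = [0.5, 0.5]
q = [0.5, 0.25, 0.25]
joint = greedy_coupling(p, q)

# Entropy of the greedy coupling versus the independent coupling p x q.
h_greedy = entropy(joint.values())
h_independent = entropy(a * b for a in p for b in q)
```

In this small example the greedy coupling places mass on three cells, whereas the independent coupling spreads it over six, so `h_greedy` is lower than `h_independent`. Since any coupling has entropy at least that of each marginal, comparing `h_greedy` against `max(entropy(p), entropy(q))` gives a quick check of how close the heuristic gets. Each iteration exhausts at least one remaining entry, so the loop runs at most `len(p) + len(q)` times; this is practical only when the supports can be enumerated, which is the limitation that motivates iterative approaches for large supports.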
