A simple polynomial-time approximation algorithm for the total variation distance between two product distributions
Weiming Feng, Heng Guo, Mark Jerrum, Jiaheng Wang
TL;DR
This work gives a polynomial-time Monte Carlo algorithm that approximates the total variation distance $d_{\mathrm{TV}}(P,Q)$ between two product distributions $P=\bigotimes_{i=1}^n P_i$ and $Q=\bigotimes_{i=1}^n Q_i$, a quantity that is hard to compute exactly. It couples $P$ and $Q$ greedily, coordinate by coordinate, takes as its sampling distribution $\pi$ the greedy coupling conditioned on the event $X\neq Y$, and evaluates a likelihood-ratio estimator $f(\omega)$ that compares an optimal coupling with the greedy one. A median-of-means scheme boosts the success probability, so the output has relative error at most $\varepsilon$ with probability at least $1-\delta$, in time $O\left(\frac{n^2}{\varepsilon^2} \log \frac{1}{\delta}\right)$. The algorithm is not restricted to Boolean domains and allows each coordinate to have its own domain size. This gives a practical, scalable tool for estimating the TV distance.
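The summary above does not spell out $\pi$ and $f$, so the following Python sketch fixes one natural reading and should be treated as an illustration under stated assumptions, not as the authors' implementation. It assumes that the greedy coupling couples each coordinate optimally and independently, so that $Z=\Pr[X\neq Y]=1-\prod_{i=1}^n\bigl(1-d_{\mathrm{TV}}(P_i,Q_i)\bigr)$; that $\pi(\omega)\propto P(\omega)-\prod_{i=1}^n\min(P_i(\omega_i),Q_i(\omega_i))$; and that $f(\omega)=\frac{P(\omega)-\min(P(\omega),Q(\omega))}{P(\omega)-\prod_{i=1}^n\min(P_i(\omega_i),Q_i(\omega_i))}$, which gives $d_{\mathrm{TV}}(P,Q)=Z\cdot\mathbb{E}_{\pi}[f]$. The function names and the parameters `num_groups`, `group_size`, and `seed` are hypothetical, and the sample sizes are not tuned to any particular $(\varepsilon,\delta)$.

```python
import math
import random
from itertools import product
from statistics import median


def tv_1d(p, q):
    """Exact TV distance between two distributions on the same finite domain."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def sample_pi(P, Q, suffix, rng):
    """Draw w with probability proportional to P(w) - prod_i min(P_i(w_i), Q_i(w_i))."""
    w, pref_p, pref_m = [], 1.0, 1.0
    for i in range(len(P)):
        # Weight of extending the prefix by a: the probability, under the assumed
        # greedy coupling, that X starts with (w, a) and X != Y.
        weights = [
            max(0.0, pref_p * P[i][a] - pref_m * min(P[i][a], Q[i][a]) * suffix[i + 1])
            for a in range(len(P[i]))
        ]
        a = rng.choices(range(len(P[i])), weights=weights)[0]
        w.append(a)
        pref_p *= P[i][a]
        pref_m *= min(P[i][a], Q[i][a])
    return w


def ratio(P, Q, w):
    """Assumed estimator f(w); its value lies in [0, 1]."""
    p = math.prod(P[i][a] for i, a in enumerate(w))
    q = math.prod(Q[i][a] for i, a in enumerate(w))
    m = math.prod(min(P[i][a], Q[i][a]) for i, a in enumerate(w))
    return (p - min(p, q)) / (p - m)


def estimate_tv(P, Q, num_groups=9, group_size=2000, seed=0):
    """Median-of-means estimate of d_TV(P, Q) for product distributions P and Q."""
    rng = random.Random(seed)
    n = len(P)
    # suffix[k] = prod_{i >= k} (1 - d_TV(P_i, Q_i)): probability that the greedy
    # coupling agrees on every coordinate from k onwards.
    suffix = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] * (1.0 - tv_1d(P[i], Q[i]))
    Z = 1.0 - suffix[0]  # Pr[X != Y] under the greedy coupling
    if Z <= 0.0:
        return 0.0  # P and Q coincide coordinate by coordinate
    means = []
    for _ in range(num_groups):
        total = sum(ratio(P, Q, sample_pi(P, Q, suffix, rng)) for _ in range(group_size))
        means.append(total / group_size)
    return Z * median(means)


def exact_tv(P, Q):
    """Brute-force TV distance; exponential in n, for sanity checks on tiny inputs only."""
    total = 0.0
    for w in product(*(range(len(p)) for p in P)):
        p = math.prod(P[i][a] for i, a in enumerate(w))
        q = math.prod(Q[i][a] for i, a in enumerate(w))
        total += abs(p - q)
    return 0.5 * total


if __name__ == "__main__":
    # Coordinates may have different domain sizes (here 2, 3, and 2).
    P = [[0.5, 0.5], [0.2, 0.3, 0.5], [0.9, 0.1]]
    Q = [[0.4, 0.6], [0.3, 0.3, 0.4], [0.8, 0.2]]
    print("estimate:", estimate_tv(P, Q))
    print("exact   :", exact_tv(P, Q))
```

Under these assumptions $f$ takes values in $[0,1]$ and $\mathbb{E}_{\pi}[f]=d_{\mathrm{TV}}(P,Q)/Z\geq 1/n$, because $d_{\mathrm{TV}}(P,Q)\geq\max_i d_{\mathrm{TV}}(P_i,Q_i)$ and $Z\leq\sum_{i=1}^n d_{\mathrm{TV}}(P_i,Q_i)$ by a union bound. This lower bound is what keeps the number of samples needed for a given relative error polynomial in $n$. The sketch recomputes the products in `ratio` for clarity rather than speed, and it does not guard against floating-point underflow for large $n$.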
Abstract
We give a simple polynomial-time approximation algorithm for the total variation distance between two product distributions.
