Approximating the Total Variation Distance between Gaussians

Arnab Bhattacharyya, Weiming Feng, Piyush Srivastava

TL;DR

This work provides an algorithm to approximate the total variation distance between two multivariate Gaussians with relative error $\varepsilon$ in time polynomial in $n$, $1/\varepsilon$, and $\log(1/D)$, where $D$ is the TV distance. The authors introduce an extension-distance framework that generalizes prior discrete-distribution analyses to continuous product measures, enabling a reduction from Gaussian TV distance to a discretized product-distribution problem. The core pipeline combines a linear-algebraic reduction to independent (1D) Gaussian components, a careful discretization of each dimension, and existing algorithms for discrete product TV distance to achieve $(1\pm\varepsilon)$ accuracy with provable error bounds. They discuss both deterministic and randomized running time results and address bit-complexity considerations, including approximate diagonalization in realistic computation models. The results advance the algorithmic understanding of TV distance in continuous settings and point to future work on broader distribution families and alternative distances.
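The discretization step of this pipeline can be illustrated with a small sketch. It assumes the two Gaussians have already been reduced to a product form, $N(0, I)$ versus a Gaussian with mean `means` and diagonal covariance `variances`, which is one standard way to realize such a reduction. The uniform grid, the brute-force enumeration over grid cells, and all function names are illustrative assumptions, not the paper's construction. The enumeration is exponential in the dimension, whereas the paper relies on existing polynomial-time algorithms for discrete product distributions.

```python
import math
from itertools import product


def normal_cdf(x, mean, var):
    """CDF of the one-dimensional Gaussian N(mean, var) at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def discretize(mean, var, edges):
    """Mass of N(mean, var) on each bin cut out by `edges`.

    The first and last bins are the unbounded tails, so the masses sum to 1.
    """
    cdf = [0.0] + [normal_cdf(e, mean, var) for e in edges] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


def product_tv_bruteforce(ps, qs):
    """TV distance between two products of discrete distributions.

    Enumerates every cell of the product grid; only usable for tiny dimension.
    """
    total = 0.0
    for cell in product(*(range(len(p)) for p in ps)):
        p_mass = math.prod(p[i] for p, i in zip(ps, cell))
        q_mass = math.prod(q[i] for q, i in zip(qs, cell))
        total += abs(p_mass - q_mass)
    return 0.5 * total


def discretized_gaussian_tv(means, variances, lo=-8.0, hi=8.0, bins=100):
    """Approximate d_TV(N(0, I), N(means, diag(variances))) on a uniform grid.

    Hypothetical helper: grid range and bin count are arbitrary choices here.
    """
    edges = [lo + (hi - lo) * k / bins for k in range(bins + 1)]
    ps = [discretize(0.0, 1.0, edges) for _ in means]
    qs = [discretize(m, v, edges) for m, v in zip(means, variances)]
    return product_tv_bruteforce(ps, qs)


if __name__ == "__main__":
    # For equal unit variances in one dimension the TV distance has the
    # closed form erf(m / (2 * sqrt(2))), which gives a sanity check.
    approx = discretized_gaussian_tv([1.0], [1.0])
    exact = math.erf(1.0 / (2.0 * math.sqrt(2.0)))
    print(approx, exact)
```

Merging points into bins can only decrease the TV distance, so the value returned here never exceeds the true distance. Controlling that gap in relative terms, especially when $D$ is tiny, is the part that needs the careful discretization described above.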

Abstract

The total variation distance is a metric of central importance in statistics and probability theory. However, somewhat surprisingly, questions about computing it algorithmically appear not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying this question in the important special case of multivariate Gaussians. More formally, we consider the problem of approximating the total variation distance between two multivariate Gaussians to within an $\varepsilon$-relative error. Previous works achieved a fixed constant relative error approximation via closed-form formulas. In this work, we give algorithms that, given any two $n$-dimensional Gaussians $D_1, D_2$ and any error bound $\varepsilon > 0$, approximate the total variation distance $D := d_{TV}(D_1, D_2)$ to $\varepsilon$-relative accuracy in $\text{poly}(n, \frac{1}{\varepsilon}, \log \frac{1}{D})$ operations. The main technical tool in our work is a reduction that helps us extend the recent progress on computing the TV-distance between discrete random variables to our continuous setting.
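For reference, the quantities named in the abstract can be written out as follows. The first line is the standard definition of total variation distance for distributions with densities. The second line is a natural reading of $\varepsilon$-relative accuracy, consistent with the $(1\pm\varepsilon)$ guarantee mentioned in the TL;DR. The symbols $p_1$, $p_2$ (the densities of $D_1$, $D_2$) and $\widehat{D}$ (the algorithm's output) are notation introduced here for illustration.

$$
D = d_{TV}(D_1, D_2) = \sup_{A \subseteq \mathbb{R}^n \text{ measurable}} \bigl| D_1(A) - D_2(A) \bigr| = \frac{1}{2} \int_{\mathbb{R}^n} \bigl| p_1(x) - p_2(x) \bigr| \, dx
$$

$$
(1 - \varepsilon)\, D \;\le\; \widehat{D} \;\le\; (1 + \varepsilon)\, D
$$

Because the error is measured relative to $D$, the guarantee becomes more demanding as $D$ shrinks, which is in line with the $\log \frac{1}{D}$ term in the stated operation count.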

Paper Structure

This paper contains 32 sections, 19 theorems, 59 equations, 1 algorithm.

Key Result

Theorem 1.2

There exists a deterministic algorithm that solves MultGaussianTV using at most $O\left(\frac{n^3}{\epsilon^2}\log^2\frac{n}{\epsilon D}\left(\log \frac{n}{\epsilon} + \log \log \frac{3}{D}\right) + \frac{n^2}{\epsilon} \log^3 \frac{n}{\epsilon D}\right)$ arithmetic operations along with diagonaliza

Theorems & Definitions (44)

  • Theorem 1.2
  • Theorem 1.3
  • Proposition 2.1
  • Proposition 2.3
  • Theorem 2.5 [FengLL24]
  • Theorem 2.6 [FGJW23]
  • Definition 3.1: Radon-Nikodym derivative [Dur19]
  • Remark 3.2
  • Definition 3.3: Valid ratio and independent products
  • Definition 3.5: The TV functional
  • ...and 34 more