Estimating Ising Models in Total Variation Distance
Constantinos Daskalakis, Vardis Kandiros, Rui Yao
TL;DR
This work provides a unified, polynomial-time framework for learning Ising models in total variation (TV) distance using the Maximum Pseudo-Likelihood Estimator (MPLE). It analyzes two broad model classes: models that satisfy the Modified Log-Sobolev Inequality (MLSI) and have bounded operator norm, and bounded-width models. For both classes it establishes concentration and anti-concentration tools, together with a Frobenius-norm-to-TV bridge, to obtain TV guarantees from finite samples. The results deliver optimal or near-optimal sample complexities across spectrally-bounded, Sherrington–Kirkpatrick (SK), diluted-SK, and antiferromagnetic-expander regimes, and sharpen guarantees under a regularity condition that translates Frobenius accuracy into TV accuracy. By connecting single-sample estimation ideas to multi-sample learning via MLSI-based concentration and Hubbard–Stratonovich decompositions, the paper advances TV learning in high-dimensional Markov random fields (MRFs) with complex dependencies.
Abstract
We consider the problem of estimating Ising models over $n$ variables in Total Variation (TV) distance, given $l$ independent samples from the model. While the statistical complexity of the problem is well-understood [DMR20], identifying computationally and statistically efficient algorithms has been challenging. Remarkable progress has been made in several settings, such as when the underlying graph is a tree [DP21, BGPV21], when the entries of the interaction matrix follow a Gaussian distribution [GM24, CK24], or when the bulk of its eigenvalues lie in a small interval [AJK+24, KLV24], but no unified framework for polynomial-time estimation in TV exists so far. Our main contribution is a unified analysis of the Maximum Pseudo-Likelihood Estimator (MPLE) for two general classes of Ising models. The first class consists of models that have bounded operator norm and satisfy the Modified Log-Sobolev Inequality (MLSI), a functional inequality that was introduced to study the convergence of the associated Glauber dynamics to stationarity. In the second class of models, the interaction matrix has bounded infinity norm (or bounded width), which is the most common assumption in the literature for structure learning of Ising models. We show how our general results for these classes yield polynomial-time algorithms and optimal or near-optimal sample complexity guarantees in a variety of settings. Our proofs employ a variety of tools, ranging from tensorization inequalities to measure decompositions and concentration bounds.
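To make the estimator concrete, the following is a minimal sketch of the MPLE under assumptions that are not taken from the paper: the model is written as $P(x) \propto \exp(\tfrac{1}{2} x^\top J x + h^\top x)$ for $x \in \{-1,+1\}^n$ with $J$ symmetric and zero on the diagonal, so that each conditional is $P(x_i \mid x_{-i}) = \exp(x_i m_i) / (2\cosh m_i)$ with local field $m_i = \sum_j J_{ij} x_j + h_i$. The function names `neg_log_pseudolikelihood` and `mple`, the plain gradient-descent solver, and the step size are illustrative choices only; the objective is convex, so any convex solver could be substituted.

```python
import numpy as np

def neg_log_pseudolikelihood(J, h, X):
    """Average negative log pseudo-likelihood of the rows of X (entries +-1).

    Assumed convention: J is symmetric with zero diagonal, so the local
    field of spin i in a sample x is m_i = sum_j J_ij x_j + h_i.
    """
    M = X @ J + h  # (l, n) matrix of local fields
    # -log P(x_i | x_-i) = log(2 cosh(m_i)) - x_i * m_i
    return float(np.mean(np.sum(np.logaddexp(M, -M) - X * M, axis=1)))

def mple(X, steps=500, lr=0.1):
    """Minimize the objective above by gradient descent (illustrative solver).

    The fixed step size is an ad hoc choice suited to small n only.
    """
    l, n = X.shape
    J = np.zeros((n, n))
    h = np.zeros(n)
    for _ in range(steps):
        M = X @ J + h
        R = np.tanh(M) - X         # residuals tanh(m_i) - x_i
        G = R.T @ X / l            # G_ij = mean over samples of R_i * x_j
        G = G + G.T                # J_ij and J_ji are the same parameter
        np.fill_diagonal(G, 0.0)   # keep the diagonal of J at zero
        J -= lr * G
        h -= lr * R.mean(axis=0)
    return J, h
```

Calling `mple(X)` on an `(l, n)` array of $\pm 1$ samples returns a symmetric estimate of the interaction matrix and an estimate of the external field. The sketch only illustrates the objective being optimized; it carries none of the TV guarantees analyzed in the paper.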
