Bypassing the Noisy Parity Barrier: Learning Higher-Order Markov Random Fields from Dynamics

Jason Gaitonde, Ankur Moitra, Elchanan Mossel

TL;DR

Direct trajectory data from a natural evolution of the MRF overcomes the fundamental computational lower bounds to efficient learning, yielding efficiency far beyond what is known or believed possible in the traditional i.i.d. case.

Abstract

We consider the problem of learning graphical models, also known as Markov random fields (MRFs), from temporally correlated samples. As in many traditional statistical settings, fundamental results in the area all assume independent samples from the distribution. However, these samples generally will not directly correspond to more realistic observations from nature, which instead evolve according to some stochastic process. From the computational lens, even generating a single sample from the true MRF distribution is intractable unless $\mathsf{NP}=\mathsf{RP}$, and moreover, any algorithm to learn from i.i.d. samples requires prohibitive runtime due to hardness reductions to the parity with noise problem. These computational barriers for sampling and learning from the i.i.d. setting severely lessen the utility of these breakthrough results for this important task; however, dropping this assumption typically only introduces further algorithmic and statistical complexities. In this work, we surprisingly demonstrate that the direct trajectory data from a natural evolution of the MRF overcomes the fundamental computational lower bounds to efficient learning. In particular, we show that given a trajectory with $\widetilde{O}_k(n)$ site updates of an order $k$ MRF from the Glauber dynamics, a well-studied, natural stochastic process on graphical models, there is an algorithm that recovers the graph and the parameters in $\widetilde{O}_k(n^2)$ time. By contrast, all prior algorithms for learning order $k$ MRFs inherently suffer from $n^{\Theta(k)}$ runtime even in sparse instances due to the reductions to sparse parity with noise. Our results thus surprisingly show that this more realistic, but intuitively less tractable, model for MRFs actually leads to efficiency far beyond what is known and believed to be true in the traditional i.i.d. case.

Paper Structure

This paper contains 27 sections, 26 theorems, 93 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1.1

Let $\mu=\mu_{\psi}$ be a $(k,d,\alpha,\lambda)$-MRF. Then there exists an algorithm that, given $O_{k,d,\alpha,\lambda}(n\log n)$ total site updates of Glauber dynamics, outputs the dependency graph $G$ of $\mu$ with high probability. The runtime is $O_{k,d,\alpha,\lambda}(n^2\log n)$.
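
To make the data model in the theorem concrete, the following is a minimal sketch of discrete-time Glauber dynamics on an order-$k$ MRF over $\pm 1$ spins. It assumes the standard form $\mu(x) \propto \exp(\psi(x))$ with $\psi$ a multilinear polynomial, so that a resampled site $i$ takes the value $+1$ with probability $\sigma(2\,\partial_i \psi(x))$. The names `interactions`, `conditional_prob_plus`, and `glauber_trajectory` are hypothetical and chosen for illustration; the paper's exact observation model may differ, and this is not the paper's learning algorithm, only a simulator for the kind of trajectory it takes as input.

```python
import math
import random


def conditional_prob_plus(x, i, interactions):
    """P(x_i = +1 | other spins), assuming mu(x) is proportional to exp(psi(x)).

    `interactions` (hypothetical format) maps a tuple of site indices S to the
    coefficient of the monomial prod_{j in S} x_j in psi.
    """
    # Partial derivative of psi in x_i: sum over monomials that contain site i.
    field = 0.0
    for sites, coeff in interactions.items():
        if i in sites:
            prod = 1
            for j in sites:
                if j != i:
                    prod *= x[j]
            field += coeff * prod
    # psi(x_i = +1) - psi(x_i = -1) = 2 * field, so the conditional is a sigmoid.
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def glauber_trajectory(n, interactions, num_updates, seed=0):
    """Run `num_updates` single-site updates and record (site, new value) pairs."""
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]  # arbitrary initial configuration
    trajectory = []
    for _ in range(num_updates):
        i = rng.randrange(n)  # site chosen uniformly at random
        p = conditional_prob_plus(x, i, interactions)
        x[i] = 1 if rng.random() < p else -1  # resample x_i given the rest
        trajectory.append((i, x[i]))
    return x, trajectory


# Illustrative order-3 instance on 5 sites (coefficients are made up).
example_interactions = {(0, 1, 2): 0.8, (2, 3): -0.5, (4,): 0.3}
final_state, traj = glauber_trajectory(5, example_interactions, num_updates=50)
```

Each recorded pair is one site update; the theorem's sample budget counts these updates, on the order of $n\log n$ with constants depending on $k, d, \alpha, \lambda$.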

Figures (3)

  • Figure 1: We consider the performance of Algorithm 1 compared to Sparsitron on random sparse parity with noise (SPN) instances (left), and demonstrate that Algorithm 1 succeeds on two parity instances (right).
  • Figure 2: We consider the performance of Algorithm 1 (left) compared to Sparsitron (right) on random SPN instances, where each run was stopped after 1500 seconds if it did not stably achieve approximate neighborhood recovery. The interquartile boxes plot the middle 50% of runtimes, the dotted circles denote the median times, and the red circles denote outlier runtimes. While Sparsitron is slightly faster for $k=2$ and small values of $n$, its runtime quickly blows up due to the $\Theta(n^k)$ time and space complexity.
  • Figure 3: We consider the performance of Algorithm 1 on instances with two opposing parities. The algorithm succeeded in all instances, typically quite quickly.

Theorems & Definitions (59)

  • Theorem 1.1: informal version of the theorem labeled thm:structure_recovery
  • Theorem 1.2: informal version of the theorem labeled thm:lr_final
  • Lemma 3.1
  • proof
  • Definition 3.4: Unbiased Distributions
  • Definition 3.5: Santha-Vazirani Source (Santha and Vazirani, FOCS 1984)
  • Lemma 3.6: Lemma 6.1 of Klivans and Meka (FOCS 2017)
  • Lemma 3.7
  • Lemma 4.1
  • proof
  • ...and 49 more