Markov bases: a 25 year update

Félix Almendra-Hernández, Jesús A. De Loera, Sonja Petrović

TL;DR

This paper assesses the Markov bases framework for sampling from conditional distributions in discrete exponential families, clarifying 25 years of development since the Fundamental Theorem of Markov Bases. It connects algebraic constructs (Markov bases, toric ideals, and Graver bases) to statistical fibers defined by $F(b)=\{u: Au=b,\ u\ge0\}$ and explores both positive (existence, connectivity) and negative (complexity, restricted fibers, non-negativity relaxations) results. New contributions include results on unbounded relaxation of fibers, the persistence of move complexity under relaxations, polynomial bounds for restricted fibers in certain hierarchical models, and limitations of incomplete move sets, especially in the no-three-way interaction model. The discussion situates algebraic advances within classical statistics and highlights practical strategies (dynamic Markov bases, SIS hybrids, mixing considerations) and software resources for implementing exact conditional tests on contingency tables and related models.

Abstract

In this paper, we evaluate the challenges and best practices associated with the Markov bases approach to sampling from conditional distributions. We provide insights and clarifications 25 years after the publication of the fundamental theorem for Markov bases by Diaconis and Sturmfels. In addition to a literature review, we prove three new results on the complexity of Markov bases in hierarchical models, relaxations of the fibers in log-linear models, and limitations of partial sets of moves in providing an irreducible Markov chain.

Paper Structure (16 sections, 11 theorems, 28 equations, 4 figures, 4 tables, 1 algorithm)

Key Result

Theorem 2.6

The set $\mathcal{M}\subset \ker_\mathbb Z A$ is a Markov basis for $A$ if and only if the corresponding set of binomials generates the toric ideal $I_A$.
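To make the connectivity guarantee behind this theorem concrete, here is a minimal Python sketch for the two-way independence model, where $A$ computes row and column sums of a table. It enumerates a tiny fiber $F(b)$ by brute force and counts connected components under two move sets: the basic $2\times 2$ moves, which are well known to form a Markov basis for this model, and a lattice basis built from the moves through cell $(0,0)$. All function names are hypothetical, the enumeration is only feasible for very small tables, and this is an illustration rather than the paper's method or the `algstat` implementation.

```python
from collections import deque
from itertools import combinations, product


def basic_move(shape, i1, i2, j1, j2):
    """Return the move with +1 at (i1,j1),(i2,j2) and -1 at (i1,j2),(i2,j1)."""
    rows, cols = shape
    m = [[0] * cols for _ in range(rows)]
    m[i1][j1] += 1
    m[i2][j2] += 1
    m[i1][j2] -= 1
    m[i2][j1] -= 1
    return tuple(tuple(row) for row in m)


def markov_basis(shape):
    """All basic 2x2 moves: a Markov basis for two-way independence."""
    rows, cols = shape
    return [basic_move(shape, i1, i2, j1, j2)
            for i1, i2 in combinations(range(rows), 2)
            for j1, j2 in combinations(range(cols), 2)]


def lattice_basis(shape):
    """Only the moves through cell (0,0): spans the kernel lattice."""
    rows, cols = shape
    return [basic_move(shape, 0, i, 0, j)
            for i in range(1, rows) for j in range(1, cols)]


def fiber(row_sums, col_sums):
    """Brute-force list of all non-negative tables with the given margins."""
    r, c = len(row_sums), len(col_sums)
    ranges = [range(min(row_sums[i], col_sums[j]) + 1)
              for i in range(r) for j in range(c)]
    tables = []
    for flat in product(*ranges):
        t = tuple(tuple(flat[i * c:(i + 1) * c]) for i in range(r))
        if (all(sum(t[i]) == row_sums[i] for i in range(r))
                and all(sum(t[i][j] for i in range(r)) == col_sums[j]
                        for j in range(c))):
            tables.append(t)
    return tables


def count_components(tables, moves):
    """Count connected components of the fiber graph under +/- moves."""
    unseen = set(tables)
    components = 0
    while unseen:
        components += 1
        queue = deque([unseen.pop()])
        while queue:
            u = queue.popleft()
            for m in moves:
                for sign in (1, -1):
                    v = tuple(tuple(a + sign * b for a, b in zip(ru, rm))
                              for ru, rm in zip(u, m))
                    # Membership in the fiber implies non-negativity.
                    if v in unseen:
                        unseen.remove(v)
                        queue.append(v)
    return components


shape = (3, 3)
tables = fiber((0, 1, 1), (0, 1, 1))
print(len(tables))                                     # 2
print(count_components(tables, markov_basis(shape)))   # 1
print(count_components(tables, lattice_basis(shape)))  # 2
```

In this fiber every lattice-basis move needs a positive entry in the first row or column, so neither table can move and the chain is reducible, while the full set of basic moves connects the two tables.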

Figures (4)

  • Figure 1: Acceptance probabilities for proposed Markov moves for fiber samplers using two types of moves: (Left) a Markov basis and (Right) a lattice basis, for 10 repeated runs of a Markov chain of default length 10,000 using the R package algstat.
  • Figure 2: Simulated $p$-values for the goodness-of-fit test of the independence model. The simulations are done using fiber samplers using two types of moves: (Left) a Markov basis and (Right) a lattice basis, for 100 repeated runs of a Markov chain of default length 10,000 using the R package algstat.
  • Figure 3: The basic move $b(i_1, i_2; j_1, j_2; k_1, k_2)$
  • Figure 4: Subsets of $[4]\times[6]\times[3]$ with staircase shape.

Theorems & Definitions (30)

  • Example 2.1
  • Definition 2.2
  • Example 2.3: job data example, continued
  • Example 2.4
  • Definition 2.5: Markov basis
  • Theorem 2.6: Fundamental theorem of Markov bases, DS98, Thm. 3.1
  • Example 2.7: job data example, continued
  • Remark 2.8
  • Definition 3.1
  • Definition 3.2: Graver basis
  • ...and 20 more