Algorithms for Models with Intractable Normalizing Functions

Murali Haran, Bokgyeong Kang, Jaewoo Park

Abstract

In this paper we discuss a well-known computing problem: inference for models with intractable normalizing functions. Models with intractable normalizing functions arise in a wide variety of areas, for instance network models, models for spatial data on lattices, spatial point processes, flexible models for count data and gene expression, and models for permutations. Simulating from these models for fixed parameter values is well studied, starting with work dating back seventy years to the origin of the Metropolis algorithm. On the other hand, some of the most practical and theoretically justified algorithms for inference, particularly Bayesian inference, have only been developed within the past two decades. The most computationally efficient algorithms often do not have well-developed theory, and few, if any, approaches exist for assessing the quality of approximations based on them. For many problems even the best algorithms can be computationally infeasible. Hence, this is an exciting area of research with many open problems. We explain several key algorithms, providing connections and touching upon practical advantages and disadvantages of each, with some discussion of theoretical properties where they impact practice. We discuss an approach for assessing the accuracy of approximations produced by these algorithms; this diagnostic is particularly valuable for algorithm tuning. While our focus is largely on models with intractable normalizing functions, we also discuss algorithms that are more broadly applicable to models where the entire likelihood function is intractable; these methods are of course also applicable to intractable normalizing function problems.

Paper Structure (16 sections, 9 equations, 6 figures)

Figures (6)

  • Figure 1: Multi-color image data simulated from the Potts model with $K = 4$ and $\theta = \log(1 + \sqrt{4})$.
  • Figure 2: ACD results for the Potts model example. (a) ACD applied to samples generated from DMH with different numbers $m$ of (inner) Swendsen-Wang updates. (b) ACD applied to samples generated from ABC-MCMC with different values $\epsilon$ of the threshold. (c) ACD applied to samples generated from LikeEm with different numbers $d$ of particles. The dashed horizontal line represents the threshold value for the diagnostic. The triangle/square and vertical lines show the empirical mean and 95% uncertainty interval, respectively, of 30 replications of the diagnostic. The red triangle and blue square indicate poor sample quality and good sample quality, respectively.
  • Figure 3: The Florentine marriage network (Breiger and Pattison, 1986), which consists of 16 vertices and 20 undirected edges.
  • Figure 4: ACD results for the ERGM example. (a) ACD applied to samples generated from ALR with different numbers $d$ of particles. (b) ACD applied to samples generated from DMH with different numbers $m$ of (inner) Gibbs updates. (c) ACD applied to samples generated from ABC-MCMC with different values $\epsilon$ of the threshold. The triangle/square and vertical lines show the empirical mean and 95% uncertainty interval, respectively, of 30 replications of the diagnostic. The red triangle and blue square indicate poor sample quality and good sample quality, respectively.
  • Figure 5: The estimated joint posterior density of $\theta_1$ and $\theta_2$ for each algorithm for the ERGM example.
  • ...and 1 more figure