Circular Belief Propagation for Approximate Probabilistic Inference

Vincent Bouttier, Renaud Jardri, Sophie Deneve

TL;DR

The paper proposes Circular Belief Propagation (CBP), an extension of BP that limits the detrimental effects of message reverberation in cyclic graphs by learning to detect and cancel spurious correlations and belief amplifications.

Abstract

Belief Propagation (BP) is a simple probabilistic inference algorithm, consisting of passing messages between nodes of a graph representing a probability distribution. Its analogy with a neural network suggests that it could have far-ranging applications for neuroscience and artificial intelligence. Unfortunately, it is only exact when applied to cycle-free graphs, which restricts the potential of the algorithm. In this paper, we propose Circular Belief Propagation (CBP), an extension of BP which limits the detrimental effects of message reverberation caused by cycles by learning to detect and cancel spurious correlations and belief amplifications. We show in numerical experiments involving binary probabilistic graphs that CBP far outperforms BP and reaches good performance compared to that of previously proposed algorithms.

Paper Structure (91 sections, 7 theorems, 99 equations, 16 figures, 1 table)

Key Result

Theorem 5.1

If $\lVert A \rVert < 1$ for some induced operator norm $\lVert \cdot \rVert$ (sometimes called a natural matrix norm), then CBP has a unique fixed point and converges to it at a rate that is at least linear.
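
The mechanism behind a result of this kind can be illustrated on a toy fixed-point iteration. The sketch below is not the CBP update and does not use the paper's matrix $A$, which is not defined in this summary. It assumes a hypothetical map $x \mapsto \tanh(Ax + b)$ with a hand-picked matrix whose induced infinity norm is below 1. Because $\tanh$ is 1-Lipschitz, this map is a contraction with factor $\lVert A \rVert_\infty$, so successive iterates approach each other at least geometrically and every starting point leads to the same fixed point.

```python
import math

def inf_norm(A):
    """Induced infinity norm of a matrix: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def step(A, b, x):
    """One iteration of the toy map x -> tanh(A x + b) (not the CBP update)."""
    return [math.tanh(sum(a * xj for a, xj in zip(row, x)) + bi)
            for row, bi in zip(A, b)]

def iterate(A, b, x0, n_steps=100):
    """Run the map; record the sup-norm gap between successive iterates."""
    x, gaps = list(x0), []
    for _ in range(n_steps):
        x_new = step(A, b, x)
        gaps.append(max(abs(u - v) for u, v in zip(x_new, x)))
        x = x_new
    return x, gaps

# Hypothetical matrix and bias, chosen so every absolute row sum is below 1.
A = [[0.0, 0.5, -0.3],
     [0.4, 0.0, 0.4],
     [-0.2, 0.6, 0.0]]
b = [0.3, -0.1, 0.2]

rate = inf_norm(A)                              # contraction factor, below 1
x_a, gaps = iterate(A, b, [5.0, -5.0, 5.0])     # first starting point
x_b, _ = iterate(A, b, [-3.0, 2.0, 0.0])        # second starting point
```

Here `gaps[t + 1] <= rate * gaps[t]` holds at every step (linear convergence), and `x_a` and `x_b` agree up to numerical precision (uniqueness of the fixed point).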

Figures (16)

  • Figure 1: Belief Propagation and Circular Belief Propagation algorithms applied to a probabilistic graph. (A) The probability distribution $p(\mathbf{x})$ is represented by a factor graph with pairwise potentials $\psi_{ij}$ and unitary potentials $\psi_i$. (B) BP aims at estimating marginals $p_i(x_i)$ by exchanging messages in the graph. The message $m_{1 \to 2}$ depends on three components: the messages received by node $x_1$ from its neighbors except $x_2$, including the external message (see full black lines), and the interaction $\psi_{12}$. The estimated marginal $b_i(x_i)$ of a node $x_i$ is formed from all messages received by the node. BP is not exact when applied to cyclic graphs, for two reasons. First, messages get counted multiple times: $m_{1 \to 2}$ naturally travels back to $x_1$ because of the cycle $x_1-x_2-x_3-x_4$. Second, opposite messages are correlated: $m_{1 \to 2}$ depends on $m_{5 \to 1}$, which depends on $m_{4 \to 5}$, which depends on $m_{1 \to 4}$, which depends on $m_{2 \to 1}$. (C) Contrary to BP, Circular BP (partially) takes $m_{2 \to 1}$ into account to compute $m_{1 \to 2}$. Parameter $\bm{\kappa}$ counteracts the belief amplifications caused by messages being reverberated. Parameter $\bm{\alpha}$ decorrelates opposite messages.
  • Figure 2: Results of Circular BP on Erdős-Rényi graphs. Estimated marginals on the test set for BP and Circular BP, for both unsupervised and supervised learning procedures. One point represents the belief of a node, on one of the test examples, and one of the 30 randomly generated graphs.
  • Figure 3: Comparison between various algorithms. Circular BP strongly outperforms the algorithms from the same family (Fractional BP, BP, Tree-Reweighted BP and Mean-Field) and the more complex Double-Loop BP. For dense graphs, CBP performs comparably to another more complex algorithm, Loop-Corrected BP. The score measure is given in Equation \ref{eq:def-score-measure}.
  • Figure 4: Potential application of Circular BP: denoising handwritten digits for computer vision. Noisy data is presented to the Hopfield network, BP, or CBP, each of which reconstructs a denoised version of the signal.
  • Figure S1: Running the Belief Propagation (BP) algorithm and its variant, the Circular Belief Propagation (CBP) algorithm. (A) Example of an acyclic graph used for the simulation. In the example, the probability distribution corresponds to an Ising model: pairwise potentials $\psi_{ij}(x_i, x_j) \propto \exp(J_{ij} x_i x_j)$ and unitary potentials $\psi_i(x_i) \propto \exp(M_{\text{ext} \to i} x_i)$ where $x_i \in \{-1, +1\}$ (binary case). $J_{ij}$ is generated randomly ($\sim \mathcal{N}(0,3)$), as is $M_{\text{ext} \to i}$ ($\sim \mathcal{N}(0,2)$). (B) The update function $f$ for both BP and CBP is a parametric sigmoidal function close to the hyperbolic tangent. The parameter $J_{ij}$ represents the level of trust between variables $x_i$ and $x_j$. (C) Belief Propagation is a message-passing algorithm which consists of running the update equation \ref{eq:BP-message} until convergence of the messages. The approximate marginals, or beliefs, are defined by Equation \ref{eq:BP-belief}. Here the beliefs found by BP are exact as the graph has no cycles. (D) The Circular Belief Propagation algorithm is a parametric generalization of the Belief Propagation algorithm with a parameter $\alpha_{ij}$ assigned to each edge $(i,j)$ of the graph. It is identical to BP for $\bm{\alpha} = \bm{1}$. In the simulation, $\bm{\alpha}$ is uniform over the edges and equal to $0.5$.
  • ...and 11 more figures
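
The setting of Figure S1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes messages and beliefs are stored as exponents, so that $b_i(x_i) \propto \exp(B_i x_i)$, in which case the BP update for the Ising potentials above takes the form $f(x, J) = \operatorname{atanh}(\tanh(J)\tanh(x))$. It also assumes CBP replaces the subtracted opposite message $M_{j \to i}$ by $\alpha M_{j \to i}$ with a single shared `alpha`, and it omits $\bm{\kappa}$. The graph, couplings, and function names (`run_cbp`, `exact_marginals`, `to_prob`) are hypothetical, and the values are hand-picked rather than drawn from the distributions in the caption.

```python
import itertools
import math

def f(x, J):
    """Sigmoidal update for Ising potentials (assumed exponent convention)."""
    return math.atanh(math.tanh(J) * math.tanh(x))

def run_cbp(J, M_ext, alpha=1.0, n_iter=50):
    """Synchronous message passing; alpha = 1 gives standard BP.

    J maps undirected edges (i, j) to couplings; M_ext lists external inputs.
    Returns B with b_i(x_i) proportional to exp(B[i] * x_i).
    """
    n = len(M_ext)
    nbrs = {i: [] for i in range(n)}
    w = {}
    for (i, j), Jij in J.items():
        nbrs[i].append(j)
        nbrs[j].append(i)
        w[(i, j)] = w[(j, i)] = Jij
    M = {edge: 0.0 for edge in w}          # directed messages M[(i, j)]: i -> j
    for _ in range(n_iter):
        B = [M_ext[i] + sum(M[(k, i)] for k in nbrs[i]) for i in range(n)]
        # Subtract (a fraction alpha of) the opposite message before sending.
        M = {(i, j): f(B[i] - alpha * M[(j, i)], w[(i, j)]) for (i, j) in w}
    return [M_ext[i] + sum(M[(k, i)] for k in nbrs[i]) for i in range(n)]

def to_prob(B):
    """Convert exponents B_i into probabilities p(x_i = +1)."""
    return [1.0 / (1.0 + math.exp(-2.0 * b)) for b in B]

def exact_marginals(J, M_ext):
    """Brute-force p(x_i = +1) by enumerating all 2^n states."""
    n = len(M_ext)
    Z, p_plus = 0.0, [0.0] * n
    for x in itertools.product([-1, 1], repeat=n):
        log_w = sum(Jij * x[i] * x[j] for (i, j), Jij in J.items())
        log_w += sum(m * xi for m, xi in zip(M_ext, x))
        weight = math.exp(log_w)
        Z += weight
        for i in range(n):
            if x[i] == 1:
                p_plus[i] += weight
    return [p / Z for p in p_plus]

# Hypothetical acyclic graph (a tree on 5 nodes) with hand-picked parameters.
J = {(0, 1): 0.8, (1, 2): -0.5, (1, 3): 1.2, (3, 4): 0.3}
M_ext = [0.4, -0.2, 0.1, 0.0, -0.6]

p_bp = to_prob(run_cbp(J, M_ext, alpha=1.0))    # standard BP
p_cbp = to_prob(run_cbp(J, M_ext, alpha=0.5))   # CBP-style update, alpha = 0.5
p_exact = exact_marginals(J, M_ext)
```

On this tree, `p_bp` matches `p_exact` up to floating-point error, consistent with BP being exact on cycle-free graphs. The `alpha=0.5` run only shows how the parameter enters the update; no claim is made about its accuracy here. Very large inputs would saturate `tanh` and push `atanh` out of its domain, so the sketch is meant for moderate couplings only.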

Theorems & Definitions (13)

  • Theorem 5.1
  • Theorem 5.2
  • Theorem 5.3
  • proof
  • Theorem D.1
  • proof
  • Corollary D.2
  • Theorem D.3
  • proof
  • Theorem D.4
  • ...and 3 more