Convergence of Some Convex Message Passing Algorithms to a Fixed Point

Vaclav Voracek, Tomas Werner

TL;DR

The paper addresses convergence questions for convex message-passing methods used in MAP inference, framing these methods as block-coordinate descent on a dual LP or Lagrangian relaxation. It analyzes coordinate descent on a piecewise-affine convex objective, proves that the iterates converge to a fixed point and that the method terminates within $\mathcal{O}(1/\varepsilon)$ iterations, and shows that prominent algorithms such as Max-Sum Diffusion and Max-Marginal Averaging are special cases of this framework. A novel energy function is introduced to guarantee descent and non-cycling under boundedness, and a mid-point variant is shown to potentially cycle. The results provide rigorous fixed-point guarantees and iteration bounds for widely used convex message-passing approaches in MAP inference and related combinatorial problems, clarifying their theoretical behavior and scope of applicability.

Abstract

A popular approach to the MAP inference problem in graphical models is to minimize an upper bound obtained from a dual linear programming or Lagrangian relaxation by (block-)coordinate descent. This is also known as convex/convergent message passing; examples are max-sum diffusion and sequential tree-reweighted message passing (TRW-S). Convergence properties of these methods are currently not fully understood. They have been proved to converge to the set characterized by local consistency of active constraints, with unknown convergence rate; however, it was not clear if the iterates converge at all (to any point). We prove a stronger result (conjectured before but never proved): the iterates converge to a fixed point of the method. Moreover, we show that the algorithm terminates within $\mathcal{O}(1/\varepsilon)$ iterations. We first prove this for a version of coordinate descent applied to a general piecewise-affine convex objective. Then we show that several convex message passing methods are special cases of this method. Finally, we show that a slightly different version of coordinate descent can cycle.
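
To make the setting concrete, the following is a minimal Python sketch of exact coordinate descent on a piecewise-affine convex objective $f(x) = \max_i (a_i^\top x + b_i)$. It is an illustration only, not the paper's algorithm: the function names (`objective`, `coordinate_minimizer`, `coordinate_descent`), the toy data, and in particular the tie-breaking rule (when a coordinate has an interval of minimizers, move to the minimizer closest to the current value) are assumptions made for this sketch.

```python
import itertools
import numpy as np


def objective(A, b, x):
    """Piecewise-affine convex objective f(x) = max_i (A[i] @ x + b[i])."""
    return float(np.max(A @ x + b))


def coordinate_minimizer(A, b, x, j, tol=1e-12):
    """Minimize t -> f(x with x[j] = t) exactly.

    Tie-breaking (an assumption of this sketch, not taken from the paper):
    among all minimizers, return the one closest to the current x[j].
    """
    slopes = A[:, j]
    # Each affine piece restricted to coordinate j is slopes[i] * t + offsets[i].
    offsets = A @ x + b - slopes * x[j]
    if slopes.min() > 0 or slopes.max() < 0:
        raise ValueError(f"objective is unbounded below along coordinate {j}")
    # The minimum is attained at the current point or at an intersection
    # of two pieces with different slopes.
    candidates = [x[j]]
    for i, k in itertools.combinations(range(len(slopes)), 2):
        if slopes[i] != slopes[k]:
            candidates.append((offsets[k] - offsets[i]) / (slopes[i] - slopes[k]))
    candidates = np.array(candidates)
    values = np.max(np.outer(candidates, slopes) + offsets, axis=1)
    best = candidates[values <= values.min() + tol]
    return float(best[np.argmin(np.abs(best - x[j]))])


def coordinate_descent(A, b, x0, max_sweeps=100, tol=1e-12):
    """Cycle through the coordinates until a full sweep moves nothing."""
    x = np.asarray(x0, dtype=float).copy()
    history = [objective(A, b, x)]
    for _ in range(max_sweeps):
        moved = 0.0
        for j in range(x.size):
            new = coordinate_minimizer(A, b, x, j)
            moved = max(moved, abs(new - x[j]))
            x[j] = new
        history.append(objective(A, b, x))
        if moved <= tol:  # fixed point of this update rule
            break
    return x, history


# Toy objective (hypothetical data): f(x) = max(|x_1| - 1, |x_2|).
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([-1.0, -1.0, 0.0, 0.0])
x_fix, history = coordinate_descent(A, b, [3.0, 5.0])
print(x_fix, history)
```

On this toy instance the sketch stops at the point $(3, 2)$ with objective value 2, although the global minimum of $f$ is 0 (attained, for example, at the origin). This shows that a fixed point of coordinate descent on a nonsmooth objective need not be a global minimizer, which is why the convergence statement above is phrased in terms of fixed points of the method.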

Paper Structure (9 sections, 8 theorems, 34 equations, 1 figure, 3 algorithms)

Key Result

Proposition 2.7

Let Assumption technical_assumption hold. Let $c$ and $C$ be given by (eq:slopes). In every inner iteration of Algorithm alg:maxaff, the energy $E_{1+C/c}(Ax+b)$ decreases by at least $c\,|x_j - x_j^*|$.

Figures (1)

  • Figure 1: Plots of the functions $x_j\mapsto f(x)$ (in blue) and $x_j\mapsto g_j(x)$ (in red) for the first update in Example ex:nonunique (so that $x_2=1$). Also shown are the three constituent affine functions (in black).

Theorems & Definitions (22)

  • Example 2.2
  • Example 2.3
  • Remark 2.4
  • Example 2.5: Werner-TR-2017-05
  • Definition 2.6
  • Proposition 2.7
  • proof
  • Lemma 2.8
  • proof
  • Theorem 2.9
  • ...and 12 more