Computing Stationary Distribution via Dirichlet-Energy Minimization by Coordinate Descent

Konstantin Avrachenkov; Lorenzo Gregoris; Nelly Litvak

TL;DR

An optimization-based formulation of the Red Light Green Light algorithm for computing stationary distributions of large Markov chains clarifies the algorithm's behavior, establishes exponential convergence for a class of chains, and suggests practical scheduling strategies to accelerate convergence.

Abstract

We present an optimization-based formulation of the Red Light Green Light (RLGL) algorithm for computing stationary distributions of large Markov chains. This perspective clarifies the algorithm's behavior, establishes exponential convergence for a class of chains, and suggests practical scheduling strategies to accelerate convergence.
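To make the setting concrete, here is a minimal sketch of a residual-driven, one-coordinate-at-a-time update for the stationary distribution of a small reversible chain. It is an illustration under stated assumptions, not the authors' RLGL implementation or schedule: the function name `stationary_by_coordinate_updates`, the uniform starting vector, the greedy largest-residual selection rule, the tolerance, and the five-node example graph are all hypothetical choices made for this sketch. For a reversible chain without self-loops, replacing `x[i]` by `(x @ P)[i]` amounts to replacing `y[i] = x[i] / pi[i]` by the average of its neighbors' values, which is an exact one-coordinate minimization of the Dirichlet energy.

```python
import numpy as np


def stationary_by_coordinate_updates(P, num_steps=5000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix P.

    Illustrative sketch only: greedy single-coordinate updates driven by the
    residual r = x P - x. The selection rule and defaults are assumptions.
    """
    n = P.shape[0]
    x = np.full(n, 1.0 / n)      # assumed starting point: uniform vector
    r = x @ P - x                # residual of the fixed-point equation x = x P
    for _ in range(num_steps):
        i = int(np.argmax(np.abs(r)))   # coordinate with the largest residual
        if abs(r[i]) < tol:
            break
        delta = r[i]
        x[i] += delta            # set x_i to (x P)_i
        r += delta * P[i]        # the change in x_i propagates along row i of P
        r[i] -= delta            # and is removed from the residual at i
    return x / x.sum()           # normalize to a probability vector


# Hypothetical example: simple random walk on a triangle with a two-node tail.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
P = A / A.sum(axis=1, keepdims=True)

pi_hat = stationary_by_coordinate_updates(P)
print(np.round(pi_hat, 6))
```

For a simple random walk on an undirected graph the stationary distribution is proportional to the node degrees, so the result can be checked against `A.sum(axis=1) / A.sum()`. Each update touches one row of `P`, which is what makes per-coordinate schemes attractive on large sparse chains compared with a full matrix-vector product.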

Paper Structure (22 sections, 21 theorems, 140 equations, 11 figures, 4 tables)

Introduction
Contributions.
Outline of the paper.
Preliminaries
Notation and conventions
Markov chains and stationary distribution
RLGL algorithm
Dirichlet energy and the graph Laplacian
Optimization with coordinate descent
Coordinate Descent for Reversible Chains
Finding an energy function for RLGL
RLGL as block descent
When coordinate descent outperforms power iteration
Extension to Nearly Reversible Chains
Irreversible chains as perturbed coordinate descent
...and 7 more sections

Key Result

Proposition 1

There exist constants $0 < c_1 \leq c_2 < +\infty$ such that

Figures (11)

Figure 1: Comparison of the convergence rates of coordinate descent (CD) and power iteration (PI) for $n=100$. The plot shows the difference $r_{CD} - r_{PI}$. In the blue region, $n$ iterations of CD beat one iteration of PI; in the red region, the converse holds. The black line marks where $r_{CD} = r_{PI}$.
Figure 2: Stationary distribution computation on the Harvard500 lscc. Comparison of different residual rescalings for the Gauss-Southwell heuristic. The $y$-axis shows the $\ell_1$ norm of the residual and the $x$-axis the iteration number. Rescaling by $\sqrt{\hat{\pi}}$ results in faster convergence.
Figure 3: Stationary distribution on Harvard500 lscc.
Figure 4: PageRank on Harvard500 lscc.
Figure 5: Stationary distribution computation on web-edu.
...and 6 more figures

Theorems & Definitions (44)

Proposition 1
proof
Definition 1: $\mu$-PL function (Wright, 2015, "Coordinate Descent Algorithms")
Lemma 1
proof
Theorem 1
Theorem 2
Lemma 2
proof
Theorem 3
...and 34 more
