A Riemannian Optimization Approach for Finding the Nearest Reversible Markov Chain

Fabio Durastante, Miryam Gnazzo, Beatrice Meini

TL;DR

This work targets the problem of approximating a given Markov chain by a reversible one with the same stationary distribution, minimizing the Frobenius distance between transition matrices. It introduces a Riemannian-optimization approach on a modified multinomial manifold $\mathcal{M}_{\boldsymbol{\pi}}$ endowed with the Fisher information metric, transforming the problem into a symmetric, positive-definite form and solving it with second-order methods. The method naturally handles transient states and decomposes reducible chains into ergodic classes, enabling blockwise optimization and substantial speedups. Compared with quadratic programming, the Riemannian approach achieves close-to-machine-precision reversibility with lower memory and substantially faster runtimes, offering a practical tool for constructing reversible operators in MCMC, molecular dynamics, and data-driven transfer operators. The paper also discusses limitations for sparse matrices and points to future work on lifting techniques and KL-divergence formulations to broaden applicability.

Abstract

We address the algorithmic problem of determining the reversible Markov chain $\tilde X$ that is closest to a given Markov chain $X$, with an identical stationary distribution. More specifically, $\tilde X$ is the reversible Markov chain with the closest transition matrix, in the Frobenius norm, to the transition matrix of $X$. To compute the transition matrix of $\tilde X$, we propose a novel approach based on Riemannian optimization. Our method introduces a modified multinomial manifold endowed with a prescribed stationary vector, while also satisfying the detailed balance conditions, all within the framework of the Fisher metric. We evaluate the performance of the proposed approach in comparison with an existing quadratic programming method and demonstrate its effectiveness through a series of synthetic experiments, as well as in the construction of a reversible Markov chain from transition count data obtained via direct estimation from a stochastic differential equation.
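The objects in this problem statement can be illustrated with a minimal NumPy sketch, assuming a small dense irreducible chain. The helper names (`stationary_distribution`, `is_reversible`, `additive_reversibilization`) are hypothetical and do not come from the paper. The additive reversibilization used here is a classical construction that produces a reversible chain with the same stationary distribution; it is in general not the Frobenius-nearest one and is not the Riemannian method proposed in the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1.
    Assumes P is row-stochastic and irreducible (illustrative helper)."""
    w, V = np.linalg.eig(P.T)
    k = np.argmin(np.abs(w - 1.0))      # eigenvalue closest to 1
    pi = np.real(V[:, k])
    return pi / pi.sum()                # also fixes the sign

def is_reversible(P, pi, tol=1e-12):
    """Detailed balance: pi_i * P_ij == pi_j * P_ji for all i, j,
    i.e. D_pi P is a symmetric matrix."""
    F = pi[:, None] * P                 # D_pi P
    return np.max(np.abs(F - F.T)) <= tol

def additive_reversibilization(P, pi):
    """Baseline only: (P + D_pi^{-1} P^T D_pi) / 2.
    Reversible with stationary vector pi, but not the nearest
    reversible chain in the Frobenius norm in general."""
    P_time_reversed = (P.T * pi[None, :]) / pi[:, None]
    return 0.5 * (P + P_time_reversed)

# A biased cycle on three states: doubly stochastic, hence pi is uniform,
# but the chain prefers one direction and so violates detailed balance.
P = np.array([[0.0, 0.9, 0.1],
              [0.1, 0.0, 0.9],
              [0.9, 0.1, 0.0]])
pi = stationary_distribution(P)
P_tilde = additive_reversibilization(P, pi)

print("P reversible?      ", is_reversible(P, pi))
print("P_tilde reversible?", is_reversible(P_tilde, pi))
print("Frobenius distance:", np.linalg.norm(P_tilde - P, "fro"))
```

The Frobenius distance printed at the end is the quantity the paper minimizes over all reversible chains sharing the stationary vector; the baseline above only gives an upper bound on that minimum.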

Paper Structure

This paper contains 12 sections, 7 theorems, 85 equations, 8 figures, 1 algorithm.

Key Result

Theorem 2.10

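Let $\mathcal{M}$ be an embedded manifold of the Euclidean space $\mathcal{E}$ and let $\mathcal{N}$ be another manifold such that $\dim(\mathcal{M}) + \dim(\mathcal{N}) = \dim(\mathcal{E})$. Assume there exists a diffeomorphism $\phi\colon \mathcal{M}\times\mathcal{N}\to\mathcal{E}^*$, $(A,B)\mapsto\phi(A,B)$, where $\mathcal{E}^*$ is an open subset of $\mathcal{E}$, and suppose there exists a neutral element $I\in\mathcal{N}$ satisfying $\phi(A,I)=A$ for all $A\in\mathcal{M}$. Then, the mapping $R_X(\xi_X)=\pi_1\bigl(\phi^{-1}(X+\xi_X)\bigr)$, where $\pi_1\colon \mathcal{M}\times\mathcal{N}\to\mathcal{M}$ is the projection onto the first component, defines a retraction on $\mathcal{M}$ for all $\xi_X$ in a neighborhood of zero in the tangent space at $X$.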
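A standard instance of this construction, not taken from the paper, is the unit sphere: take $\mathcal{M}=S^{n-1}\subset\mathbb{R}^n$, $\mathcal{N}=(0,\infty)$, the diffeomorphism $\phi(x,r)=rx$ onto $\mathcal{E}^*=\mathbb{R}^n\setminus\{0\}$, and the neutral element $I=1$. In Proposition 4.1.2 of Absil et al., on which the theorem is based, such a map yields the retraction $R_x(\xi)=\pi_1(\phi^{-1}(x+\xi))=(x+\xi)/\|x+\xi\|$. The sketch below, with hypothetical function names, checks the two defining properties of a retraction numerically on this example.

```python
import numpy as np

def phi(x, r):
    """Diffeomorphism S^{n-1} x (0, inf) -> R^n \\ {0}; phi(x, 1) = x."""
    return r * x

def phi_inv(y):
    """Inverse of phi: split a nonzero vector into direction and length."""
    r = np.linalg.norm(y)
    return y / r, r

def retract(x, xi):
    """R_x(xi) = pi_1(phi^{-1}(x + xi)): keep only the sphere component."""
    point_on_sphere, _ = phi_inv(x + xi)
    return point_on_sphere

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
x /= np.linalg.norm(x)                  # a point on the unit sphere
v = rng.standard_normal(5)
xi = v - (x @ v) * x                    # tangent vector at x: x^T xi = 0

# Property 1: R_x(0) = x.
centered = np.allclose(retract(x, np.zeros(5)), x)

# Property 2 (local rigidity): d/dt R_x(t xi) at t = 0 equals xi,
# checked here with a forward finite difference.
t = 1e-6
derivative = (retract(x, t * xi) - x) / t
rigid = np.allclose(derivative, xi, atol=1e-5)

print("R_x(0) = x:", centered)
print("dR_x(0) = identity on the tangent space:", rigid)
```

The same pattern (find a complementary manifold and a diffeomorphism with a neutral element, then project) is how retractions are commonly derived for multinomial-type manifolds; whether the paper uses exactly this route for $\mathcal{M}_{\boldsymbol{\pi}}$ is not stated in the excerpt above.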

Figures (8)

  • Figure 1: Example of application of Algorithm 1 to the Markov chain built from the graph of the observed attendance at 14 social events by 18 Southern women [davis1941deep].
  • Figure 2: Examples of test problems: a stochastic matrix generated from uniformly random entries, a chain derived from normally distributed random entries, a chain constructed using a stochastic block model, and a chain with multiple ergodic classes.
  • Figure 3: The two plots represent the metric $\| D_{\boldsymbol{\pi}}P - P^\top D_{\boldsymbol{\pi}} \|_\infty$ for $P$, the nearest reversible matrix computed via the different algorithms. The left plot shows the performance profile, while the right plot provides detailed results for each experiment.
  • Figure 4: The two plots represent the metric $\| \boldsymbol{\pi}^\top P - \boldsymbol{\pi}^\top \|_\infty$ for $P$, the nearest reversible matrix computed via the different algorithms. The left plot shows the performance profile, while the right plot provides detailed results for each experiment.
  • Figure 5: The two plots represent the relative Frobenius norm distance to the nearest reversible matrix computed via the different algorithms. The left plot shows the performance profile, while the right plot provides detailed results for each experiment.
  • ...and 3 more figures
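
The three quantities reported in Figures 3 to 5 can be computed as in the following NumPy sketch. The function names are hypothetical, and the normalization of the relative Frobenius distance by $\|P\|_F$ is an assumption, since the caption does not spell it out.

```python
import numpy as np

def detailed_balance_residual(P, pi):
    """|| D_pi P - P^T D_pi ||_inf, the maximum absolute row sum.
    Zero exactly when P is reversible with respect to pi."""
    D = np.diag(pi)
    return np.linalg.norm(D @ P - P.T @ D, ord=np.inf)

def stationarity_residual(P, pi):
    """|| pi^T P - pi^T ||_inf, the largest absolute entry.
    Zero exactly when pi is stationary for P."""
    return np.linalg.norm(pi @ P - pi, ord=np.inf)

def relative_frobenius_distance(P_hat, P):
    """||P_hat - P||_F / ||P||_F (normalization is an assumption)."""
    return np.linalg.norm(P_hat - P, "fro") / np.linalg.norm(P, "fro")

# Every irreducible two-state chain is reversible.
P2 = np.array([[0.9, 0.1],
               [0.2, 0.8]])
pi2 = np.array([2.0, 1.0]) / 3.0

# A deterministic 3-cycle keeps the uniform vector stationary
# but is as far from detailed balance as a cycle can be.
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
u = np.ones(3) / 3.0

print("two-state chain:", detailed_balance_residual(P2, pi2),
      stationarity_residual(P2, pi2))
print("3-cycle:        ", detailed_balance_residual(C, u),
      stationarity_residual(C, u))
```

A computed nearest reversible matrix should drive the first two residuals toward machine precision, while the third measures how far it had to move from the input chain.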

Theorems & Definitions (26)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Definition 2.6
  • Definition 2.7
  • Definition 2.8
  • Definition 2.9
  • Theorem 2.10: Proposition 4.1.2 [AbsilBook]
  • ...and 16 more