Stochastic $p$th root approximation of a stochastic matrix: A Riemannian optimization approach

Fabio Durastante; Beatrice Meini

TL;DR

The paper addresses the problem of approximating the $p$th root of a stochastic matrix by a stochastic matrix, leveraging Riemannian optimization on manifolds of positive stochastic matrices. It introduces two complementary approaches: (i) optimization on the multinomial manifold $\mathbb{S}_n$ to minimize $\frac{1}{2}\|X^p-A\|_F^2$, and (ii) optimization on the new manifold $\mathbb{S}_n^{\boldsymbol{\pi}}$ of matrices sharing the stationary distribution with $A$, ensuring the root preserves the stationary vector. The authors derive the necessary tangent spaces, projections, Riemannian gradients, and retractions (notably a Sinkhorn-based retraction), implement the methods in MATLAB/Manopt, and demonstrate that the Riemannian methods typically outperform constrained optimization in speed and accuracy, with the $\boldsymbol{\pi}$-preserving variant offering stationarity guarantees at a controlled cost. They also provide careful treatment of computational challenges, including the linear systems that arise in projections, spectral bounds, and preconditioning strategies, and show applicability to reducible chains through perturbation techniques. Overall, the work advances stochastic matrix embeddings by enabling efficient, structure-preserving approximations with practical impact on Markov-chain modeling and embedding problems.
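To make the cost function of approach (i) concrete, the following is a minimal Python/NumPy sketch, not the authors' MATLAB/Manopt implementation. It evaluates $\frac{1}{2}\|X^p-A\|_F^2$ and its Euclidean gradient $\sum_{k=0}^{p-1}(X^T)^k (X^p-A)(X^T)^{p-1-k}$, then compares the gradient against a central finite difference. The helper names `cost`, `euclidean_gradient`, and `random_positive_stochastic` are hypothetical, and the test matrices are random. Turning this Euclidean gradient into a Riemannian gradient requires the metric and projection of the chosen manifold, which this sketch does not cover.

```python
import numpy as np

def cost(X, A, p):
    """Objective f(X) = 0.5 * ||X^p - A||_F^2."""
    R = np.linalg.matrix_power(X, p) - A
    return 0.5 * np.sum(R * R)

def euclidean_gradient(X, A, p):
    """Euclidean gradient of f: sum_k (X^T)^k (X^p - A) (X^T)^(p-1-k)."""
    R = np.linalg.matrix_power(X, p) - A
    Xt = X.T
    G = np.zeros_like(X)
    for k in range(p):
        left = np.linalg.matrix_power(Xt, k)
        right = np.linalg.matrix_power(Xt, p - 1 - k)
        G += left @ R @ right
    return G

def random_positive_stochastic(n, rng):
    """Hypothetical helper: entrywise positive matrix with unit row sums."""
    M = rng.random((n, n)) + 0.1
    return M / M.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, p = 5, 3
A = random_positive_stochastic(n, rng)
X = random_positive_stochastic(n, rng)

# Directional derivative check along a random direction H.
H = rng.standard_normal((n, n))
t = 1e-6
fd = (cost(X + t * H, A, p) - cost(X - t * H, A, p)) / (2 * t)
an = np.sum(euclidean_gradient(X, A, p) * H)
print("finite difference:", fd, "analytic:", an)

# If A is exactly X^p, the residual and hence the cost vanish.
A_exact = np.linalg.matrix_power(X, p)
print("cost at an exact root:", cost(X, A_exact, p))
```

The finite-difference comparison is a common sanity check before passing a gradient to any optimization routine; agreement to several digits suggests the formula is implemented consistently with the cost.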

Abstract

We propose two approaches, based on Riemannian optimization, for computing a stochastic approximation of the $p$th root of a stochastic matrix $A$. In the first approach, the approximation is sought in the Riemannian manifold of positive stochastic matrices. In the second approach, we introduce the Riemannian manifold of positive stochastic matrices that share the Perron eigenvector with $A$, and we compute the approximation of the $p$th root of $A$ in this manifold. In this way, unlike the available methods based on constrained optimization, $A$ and its $p$th root approximation share the Perron eigenvector. This property is relevant, from a modelling point of view, in the embedding problem for Markov chains. Extensive numerical experiments show that, in the first approach, the Riemannian optimization methods are generally faster and more accurate than the available methods based on constrained optimization. In the second approach, even though the stochastic approximation of the $p$th root is sought in a smaller set, the approximation is generally more accurate than the one obtained by standard constrained optimization.
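The shared-Perron-eigenvector property can be illustrated numerically. The sketch below, in Python/NumPy, is an illustration under stated assumptions rather than the authors' construction: it computes the stationary vector $\boldsymbol{\pi}$ of a random positive stochastic matrix $A$, uses a plain Sinkhorn-type diagonal scaling to produce some positive stochastic $X$ with $\boldsymbol{\pi}^T X = \boldsymbol{\pi}^T$, and checks that $X^p$ inherits the same vector. The function names `stationary_distribution` and `sinkhorn_to_shared_pi`, the iteration limits, and the random data are all hypothetical. The matrix $X$ here is not an approximate root of $A$; it only shows the constraint set in which the second approach searches.

```python
import numpy as np

def stationary_distribution(A):
    """Left Perron vector pi with pi^T A = pi^T, normalized to sum to 1."""
    w, V = np.linalg.eig(A.T)
    k = np.argmin(np.abs(w - 1.0))   # eigenvalue closest to 1
    pi = np.real(V[:, k])
    return pi / pi.sum()

def sinkhorn_to_shared_pi(M, pi, iters=500, tol=1e-13):
    """Scale a positive matrix M to X = D1 M D2 with X 1 = 1, pi^T X = pi^T.

    Y = diag(pi) X must have row sums and column sums both equal to pi,
    so a standard Sinkhorn iteration with marginals (pi, pi) is applied.
    """
    v = np.ones_like(pi)
    for _ in range(iters):
        u = pi / (M @ v)             # enforce row sums of Y equal to pi
        v = pi / (M.T @ u)           # enforce column sums of Y equal to pi
        Y = u[:, None] * M * v[None, :]
        if np.max(np.abs(Y.sum(axis=1) - pi)) < tol:
            break
    return Y / pi[:, None]           # X = diag(pi)^{-1} Y

rng = np.random.default_rng(1)
n, p = 5, 3
A = rng.random((n, n)) + 0.1
A /= A.sum(axis=1, keepdims=True)    # positive stochastic test matrix
pi = stationary_distribution(A)

M = rng.random((n, n)) + 0.1         # arbitrary positive starting matrix
X = sinkhorn_to_shared_pi(M, pi)
Xp = np.linalg.matrix_power(X, p)

print("row-sum error of X:   ", np.max(np.abs(X.sum(axis=1) - 1.0)))
print("||pi^T X - pi^T||_inf:  ", np.max(np.abs(pi @ X - pi)))
print("||pi^T X^p - pi^T||_inf:", np.max(np.abs(pi @ Xp - pi)))
```

Because $\boldsymbol{\pi}^T X = \boldsymbol{\pi}^T$ implies $\boldsymbol{\pi}^T X^p = \boldsymbol{\pi}^T$ for every positive integer $p$, any $X$ in this set has a $p$th power with the same stationary vector, which is the modelling property the abstract highlights.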

Paper Structure (14 sections, 13 theorems, 103 equations, 8 figures, 2 tables)

Introduction
Notation
Preliminaries on Riemannian optimization
Optimization methods
Stochastic $p$th root approximation via Riemannian optimization
Stochastic $p$th root approximation preserving the stationary distribution
A new Riemannian manifold
Computational issues
Numerical Examples
Stochastic $p$th root approximation via Riemannian optimization
Stochastic $p$th root approximation preserving the stationary distribution
Properties and solution of the associated linear systems
The case of reducible Markov chains
Conclusions and future directions

Key Result

theorem 1

Let $\mathcal{M}$ be an embedded submanifold of $\mathcal{E}$. Consider $x \in \mathcal{M}$ and the set $\mathcal{T}_x \mathcal{M}$ from Definition 2. If $\mathcal{M}$ is an open submanifold, then $\mathcal{T}_x \mathcal{M}=\mathcal{E}$. Otherwise $\mathcal{T}_x \mathcal{M} = \ker \

Figures (8)

Figure 1: Tangent space (opaque blue) of a two-dimensional manifold (red) embedded in $\mathbb{R}^3$. The tangent space $\mathcal{T}_x\mathcal{M}$ is computed by taking derivatives of the curves $x_0(t)$ and $x_1(t)$ (red dotted lines) going through ${A}$ at the origin.
Figure 2: The basic idea of Riemannian optimization algorithms: evolving the iterates using local pull-back (retraction) from the tangent space to the manifold.
Figure 3: Approximate stochastic square root for $A$, the normalized stochastic matrix obtained from the SuiteSparse Matrix Collection. We depict the difference between the entries of the stationary distribution of the original matrix $A$ and those of its approximated root.
Figure 4: Performance comparison for stochastic $p$th root approximation via Riemannian optimization and constrained optimization with the formulation \ref{alg:a2}. The test matrices are $40$ stochastic matrices from each class in Table \ref{tab:matrix-classes}.
Figure 5: The left panel reports the Frobenius-norm error $\|X^p - A\|_F$ obtained by applying the \texttt{trustregions} method with initial guess \texttt{X0 = M.rand()} and gradient tolerance \texttt{1e-4} for a matrix generated from the classes in Table \ref{tab:matrix-classes}. The right panel reports the infinity-norm error $\|\tilde{\boldsymbol{\pi}} - \boldsymbol{\pi}\|_\infty$ between the vectors satisfying $\boldsymbol{\pi}^T A = \boldsymbol{\pi}^T$ and $\tilde{\boldsymbol{\pi}}^T A = \tilde{\boldsymbol{\pi}}^T$. The dashed line marks the floating-point relative accuracy of MATLAB's double-precision numbers.
...and 3 more figures

Theorems & Definitions (27)

definition 1: Embedded Manifold
definition 2: Tangent vector, tangent bundle
theorem 1: MR4533407
definition 3: Affine connection
definition 4: Riemannian Gradient and Hessian
definition 5: Retraction
theorem 2: AbsilBook
lemma 1
proof
lemma 2
...and 17 more
