
Stochastic $p$th root approximation of a stochastic matrix: A Riemannian optimization approach

Fabio Durastante, Beatrice Meini

TL;DR

The paper addresses the problem of approximating the $p$th root of a stochastic matrix by a stochastic matrix, using Riemannian optimization on manifolds of positive stochastic matrices. It introduces two complementary approaches: (i) optimization on the multinomial manifold $S_n$ to minimize $\frac{1}{2}\|X^p-A\|_F^2$, and (ii) optimization on the new manifold $S_n^{\boldsymbol{\nu}}$ of matrices sharing the stationary distribution with $A$, which ensures that the root preserves the stationary vector. The authors derive the necessary tangent spaces, projections, Riemannian gradients, and retractions (notably a Sinkhorn-based retraction), implement the methods in MATLAB/Manopt, and demonstrate that the Riemannian methods typically outperform constrained optimization in speed and accuracy, with the stationary-distribution-preserving variant offering stationarity guarantees at a controlled cost. They also treat the computational challenges, including the linear systems that arise in projections, spectral bounds, and preconditioning strategies, and show applicability to reducible chains through perturbation techniques. Overall, the work advances stochastic matrix embeddings by enabling efficient, structure-preserving approximations with practical impact on Markov-chain modeling and embedding problems.
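To make the objective concrete, the following is a minimal plain-Python sketch that evaluates $\frac{1}{2}\|X^p-A\|_F^2$ and checks row-stochasticity and stationarity for a small matrix. It is not the authors' MATLAB/Manopt implementation: the helper names (`matmul`, `matpow`, `cost`, `is_stochastic`, `stationary_residual`) and the $2\times 2$ test matrix are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; all names and data here are hypothetical.

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, p):
    """X raised to the positive integer power p by repeated multiplication."""
    R = X
    for _ in range(p - 1):
        R = matmul(R, X)
    return R

def cost(X, A, p):
    """Objective 0.5 * ||X^p - A||_F^2 (squared Frobenius norm)."""
    Xp = matpow(X, p)
    n = len(A)
    return 0.5 * sum((Xp[i][j] - A[i][j]) ** 2
                     for i in range(n) for j in range(n))

def is_stochastic(X, tol=1e-12):
    """True if all entries are nonnegative and every row sums to 1."""
    nonneg = all(x >= 0 for row in X for x in row)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in X)
    return nonneg and rows_ok

def stationary_residual(pi, X):
    """Infinity norm of pi^T X - pi^T."""
    n = len(X)
    return max(abs(sum(pi[i] * X[i][j] for i in range(n)) - pi[j])
               for j in range(n))

# Hypothetical example: X is a positive stochastic matrix and A = X^2,
# so X is an exact stochastic square root of A.
X = [[0.9, 0.1], [0.2, 0.8]]
A = matpow(X, 2)
pi = [2.0 / 3.0, 1.0 / 3.0]          # satisfies pi^T X = pi^T
identity = [[1.0, 0.0], [0.0, 1.0]]  # a stochastic matrix that is not a root
```

In this toy case `cost(X, A, 2)` vanishes and `pi` is stationary for both `X` and `A`, which is the situation the second approach enforces by construction; a stochastic matrix that is not a root, such as `identity`, gives a strictly positive cost.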

Abstract

We propose two approaches, based on Riemannian optimization, for computing a stochastic approximation of the $p$th root of a stochastic matrix $A$. In the first approach, the approximation is found in the Riemannian manifold of positive stochastic matrices. In the second approach, we introduce the Riemannian manifold of positive stochastic matrices sharing with $A$ the Perron eigenvector and we compute the approximation of the $p$th root of $A$ in such a manifold. This way, differently from the available methods based on constrained optimization, $A$ and its $p$th root approximation share the Perron eigenvector. Such a property is relevant, from a modelling point of view, in the embedding problem for Markov chains. The extended numerical experimentation shows that, in the first approach, the Riemannian optimization methods are generally faster and more accurate than the available methods based on constrained optimization. In the second approach, even though the stochastic approximation of the $p$th root is found in a smaller set, the approximation is generally more accurate than the one obtained by standard constrained optimization.
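The second approach needs iterates that are positive, row-stochastic, and share the Perron eigenvector $\boldsymbol{\pi}$ of $A$, i.e. $X\mathbf{1}=\mathbf{1}$ and $\boldsymbol{\pi}^T X=\boldsymbol{\pi}^T$. One standard way to map a positive matrix onto such a set is Sinkhorn-type alternating scaling: the matrix $Y=\operatorname{diag}(\boldsymbol{\pi})X$ must have both row sums and column sums equal to $\boldsymbol{\pi}$. The sketch below illustrates that general idea in plain Python; it is an assumption-laden illustration and not necessarily the retraction derived in the paper. The function name `sinkhorn_to_stationary`, the fixed iteration count, and the test data are all hypothetical.

```python
# Illustrative sketch only; not the authors' retraction or code.

def sinkhorn_to_stationary(M, pi, iters=500):
    """Scale a positive matrix M so the result X is row-stochastic
    and (approximately) satisfies pi^T X = pi^T.

    Works on Y = diag(pi) * M and alternately rescales columns and rows
    of Y until both sets of sums match pi, then returns diag(pi)^{-1} Y.
    """
    n = len(M)
    Y = [[pi[i] * M[i][j] for j in range(n)] for i in range(n)]
    for _ in range(iters):
        # Column step: force every column sum of Y to equal pi[j].
        for j in range(n):
            c = sum(Y[i][j] for i in range(n))
            for i in range(n):
                Y[i][j] *= pi[j] / c
        # Row step: force every row sum of Y to equal pi[i].
        for i in range(n):
            r = sum(Y[i])
            Y[i] = [y * pi[i] / r for y in Y[i]]
    # Undo the diag(pi) weighting to obtain a row-stochastic matrix.
    return [[Y[i][j] / pi[i] for j in range(n)] for i in range(n)]

# Hypothetical data: an arbitrary positive matrix and a target
# stationary distribution.
M = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
pi = [0.2, 0.3, 0.5]
X = sinkhorn_to_stationary(M, pi)

row_err = max(abs(sum(row) - 1.0) for row in X)
stat_err = max(abs(sum(pi[i] * X[i][j] for i in range(3)) - pi[j])
               for j in range(3))
```

Because the loop ends with a row step, the rows of `X` sum to one up to rounding, while the stationarity residual `stat_err` shrinks as the iteration converges. A fixed iteration count is used here only for brevity; a practical version would stop on a residual tolerance.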

Paper Structure (14 sections, 13 theorems, 103 equations, 8 figures, 2 tables)

This paper contains 14 sections, 13 theorems, 103 equations, 8 figures, 2 tables.

Key Result

theorem 1

Let $\mathcal{M}$ be an embedded submanifold of $\mathcal{E}$. Consider $x \in \mathcal{M}$ and the set $\mathcal{T}_x \mathcal{M}$ from Definition 2. If $\mathcal{M}$ is an open submanifold, then $\mathcal{T}_x \mathcal{M}=\mathcal{E}$. Otherwise $\mathcal{T}_x \mathcal{M} = \ker \dots$

Figures (8)

  • Figure 1: Tangent space (opaque blue) of a two-dimensional manifold (red) embedded in $\mathbb{R}^3$. The tangent space $\mathcal{T}_x\mathcal{M}$ is computed by taking derivatives of the curves $x_0(t)$ and $x_1(t)$ (red dotted lines) going through ${A}$ at the origin.
  • Figure 2: The basic idea of Riemannian optimization algorithms: evolving the iterates using local pull-back (retraction) from the tangent space to the manifold.
  • Figure 3: Approximate stochastic square root for $A$, the normalized stochastic matrix obtained from a matrix of the SuiteSparse Matrix Collection. We depict the difference between the entries of the stationary distribution of the original matrix $A$ and those of its approximate root.
  • Figure 4: Performance comparison for stochastic $p$th root approximation via Riemannian optimization and constrained optimization with the formulation \ref{alg:a2}. The test matrices are $40$ stochastic matrices from each class in Table \ref{tab:matrix-classes}.
  • Figure 5: The left panel reports the error in Frobenius norm $\|X^p - A\|_F$ obtained by applying the trustregions method with initial guess X0 = M.rand() and a gradient tolerance of 1e-4 for a matrix generated from the classes in Table \ref{tab:matrix-classes}. The right panel reports the infinity norm error between $\boldsymbol{\pi}^T A = \boldsymbol{\pi}^T$ and $\tilde{\boldsymbol{\pi}}^T A = \tilde{\boldsymbol{\pi}}^T$, i.e., $\|\tilde{\boldsymbol{\pi}} - \boldsymbol{\pi}\|_\infty$. The dashed line represents the floating-point relative accuracy of MATLAB's double-precision numbers.
  • ...and 3 more figures

Theorems & Definitions (27)

  • definition 1: Embedded Manifold
  • definition 2: Tangent vector, tangent bundle
  • theorem 1: MR4533407
  • definition 3: Affine connection
  • definition 4: Riemannian Gradient and Hessian
  • definition 5: Retraction
  • theorem 2: AbsilBook
  • lemma 1
  • proof
  • lemma 2
  • ...and 17 more