Distances for Markov chains from sample streams

Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas

TL;DR

This work proposes a stochastic primal-dual optimization method that estimates bisimulation metrics between Markov chains from sample access alone, without requiring explicit transition models.

Abstract

Bisimulation metrics are powerful tools for measuring similarities between stochastic processes, and Markov chains in particular. Recent advances have uncovered that bisimulation metrics are, in fact, optimal-transport distances, which has enabled the development of fast algorithms for computing such metrics with provable accuracy and runtime guarantees. However, these recent methods, as well as all previously known methods, assume full knowledge of the transition dynamics. This assumption is impractical in most real-world scenarios, where typically only sample trajectories are available. In this work, we propose a stochastic optimization method that addresses this limitation and estimates bisimulation metrics based on sample access, without requiring explicit transition models. Our approach is derived from a new linear programming (LP) formulation of bisimulation metrics, which we solve using a stochastic primal-dual optimization method. We provide theoretical guarantees on the sample complexity of the algorithm and validate its effectiveness through a series of empirical evaluations.
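To make the optimal-transport view concrete, here is a minimal sketch of the model-based setting that the abstract contrasts with: a fixed-point iteration for a discounted bisimulation-style distance between two 2-state Markov chains whose transition matrices are fully known. It is not the sample-based method proposed in the paper. The function names, the discount `gamma`, and the normalization `(1 - gamma) * c + gamma * OT` are illustrative assumptions, not taken from the source.

```python
def ot_two_point(p0, q0, d):
    """Optimal-transport cost between two distributions on {0, 1}.

    p0, q0: probability of state 0 under each distribution.
    d: 2x2 ground-cost matrix, indexed d[x][y].
    A coupling is determined by t = pi(0, 0). The cost is linear in t,
    so the minimum sits at an endpoint of the feasible interval.
    """
    lo, hi = max(0.0, p0 + q0 - 1.0), min(p0, q0)

    def cost(t):
        return (t * d[0][0] + (p0 - t) * d[0][1]
                + (q0 - t) * d[1][0] + (1.0 - p0 - q0 + t) * d[1][1])

    return min(cost(lo), cost(hi))


def bisimulation_distance(P, Q, c, gamma=0.9, iters=200):
    """Iterate d <- (1 - gamma) * c + gamma * OT_d(P(.|x), Q(.|y)).

    P, Q: 2x2 row-stochastic transition matrices (assumed known here).
    c: 2x2 immediate cost between states of the two chains.
    The recursion and its normalization are assumptions of this sketch.
    """
    d = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):
        d = [[(1.0 - gamma) * c[x][y]
              + gamma * ot_two_point(P[x][0], Q[y][0], d)
              for y in range(2)]
             for x in range(2)]
    return d


# Usage: compare a chain with itself under a 0/1 state-mismatch cost.
P = [[0.9, 0.1], [0.2, 0.8]]
c = [[0.0, 1.0], [1.0, 0.0]]
d = bisimulation_distance(P, P, c)
```

Every step of this iteration reads the transition rows `P[x]` and `Q[y]` directly, which is the full-knowledge assumption that a sample-based estimator has to avoid.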

Paper Structure

This paper contains 36 sections, 17 theorems, 94 equations, 5 figures, 1 table, 3 algorithms.

Key Result

Proposition 1

The distribution $\mu$ is the induced occupancy coupling of a bicausal coupling $\pi \in \Pi_{\text{bc}}$ if and only if there exist $\lambda_{\mathcal{X}}\in\mathbb{R}^{\mathcal{Y}\mathcal{X}}_+$ and $\lambda_{\mathcal{Y}}\in\mathbb{R}^{\mathcal{X}\mathcal{Y}}_+$ such that the following equations hold: … Furthermore, if the equations are satisfied for some $\mu$, $\lambda_{\mathcal{X}}$ and $\lambda_{\mathcal{Y}}$, …

Figures (5)

  • Figure 1: Encoder-decoder maps learned by the algorithm in a block Markov chain example ($n = 10$, $B=5$) for sample sizes $1000$, $10000$ and $100000$.
  • Figure 2: Model selection results for random walks and the pendulum environment.
  • Figure 3: Distance matrices between instances after running SOMCOT for $1000$ and $10000$ steps, and the ground truth obtained via Sinkhorn Value Iteration.
  • Figure 4: The influence of the ratio between $\eta$ and $\beta$ on the convergence of SOMCOT for different chain sizes. Error and learning rates are shown on a logarithmic scale. To produce this plot, a decay rate of $a=0.001$ was used for $\eta$. No decay was applied on $\beta$.
  • Figure : SOMCOT

Theorems & Definitions (24)

  • Proposition 1
  • Theorem 1
  • Lemma 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Lemma 2
  • Lemma 3
  • proof
  • Lemma 4
  • ...and 14 more