Markov Chains with Rewinding

Amir Azarmehr; Soheil Behnezhad; Alma Ghafari; Madhu Sudan

TL;DR

This work introduces Markov chains with rewinding, a model in which an algorithm interacts with a partially observable (p.o.) Markov chain and can rewind it to earlier points in time. Focusing on the task of identifying the initial state, the authors compare adaptive and non-adaptive rewinding strategies: without efficiency constraints the two are equally powerful, but a polynomial gap emerges when efficiency is measured by query complexity. They develop a polynomially efficient non-adaptive algorithm based on a partition-graph framework and multiplicative path costs, and they present a near-optimal adaptive strategy that achieves significantly better query complexity in certain constructions. A key contribution is the demonstration that canonical p.o. Markov chains suffice to capture the difficulty of the general problem, via reductions that preserve adaptive and non-adaptive query complexity up to polylogarithmic factors. Overall, the results provide a structured toolkit for analyzing rewinding strategies, connect to sublinear-time lower bounds for graph problems, and give a quantitative account of when adaptivity helps in processes that allow rewinding.

Abstract

Motivated by techniques developed in recent progress on lower bounds for sublinear time algorithms (Behnezhad, Roghani and Rubinstein, STOC 2023, FOCS 2023, and STOC 2024), we introduce and study a new class of randomized algorithmic processes that we call Markov Chains with Rewinding. In this setting, an algorithm interacts with a (partially observable) Markovian random evolution by strategically rewinding the Markov chain to previous states. Depending on the application, this may lead the evolution to desired states faster, or allow the agent to efficiently learn or test properties of the underlying Markov chain that may be infeasible or inefficient to learn or test with passive observation. We study the task of identifying the initial state in a given partially observable Markov chain. Analysis of this question in specific Markov chains is the central ingredient in the above-cited works, and we aim to systematize the analysis in our work. Our first result is that any pair of states distinguishable with any rewinding strategy can also be distinguished with a non-adaptive rewinding strategy (one whose rewinding choices are determined before observing any outcomes of the chain). Therefore, while rewinding strategies can be shown to be strictly more powerful than passive strategies (those that do not rewind back to previous states), adaptivity does not give additional power to a rewinding strategy in the absence of efficiency considerations. The difference becomes apparent, however, when we introduce a natural efficiency measure, namely the query complexity (i.e., the number of observations a strategy needs to identify distinguishable states). Our second main contribution is to quantify this efficiency gap. We present a non-adaptive rewinding strategy whose query complexity is within a polynomial of that of the optimal (adaptive) strategy, and show that such a polynomial loss is necessary in general.
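To make the claim that rewinding is strictly more powerful than passive observation concrete, here is a minimal Python sketch on a toy chain. Everything in it is a hypothetical illustration rather than a construction from the paper: the seven-state chain, the class `RewindableChain`, its method `step_from`, and the function `guess_initial_state` are assumed names, and the rewinding interface is only one plausible reading of the model. From either start state, a passive observer sees `s`, then `m`, then `0` or `1` forever with probability 1/2 each, so the two start states look identical. Rewinding to time 1 and resampling the next step separates them: from `a` all samples agree, while from `b` they are independent coin flips.

```python
import random


class RewindableChain:
    """Toy partially observable Markov chain with rewinding (assumed interface)."""

    def __init__(self, transitions, observations, initial_state, seed=None):
        self.transitions = transitions    # state -> list of (next_state, probability)
        self.observations = observations  # state -> observation symbol
        self.rng = random.Random(seed)
        self.history = [initial_state]    # hidden state stored at each node
        self.queries = 0                  # number of observations made so far

    def step_from(self, node):
        """Rewind to `node`, take one fresh random step, return (new_node, observation)."""
        state = self.history[node]
        next_states, probs = zip(*self.transitions[state])
        new_state = self.rng.choices(next_states, weights=probs)[0]
        self.history.append(new_state)
        self.queries += 1
        return len(self.history) - 1, self.observations[new_state]


# Hypothetical chain: "a" commits to 0 or 1 at time 1, "b" flips the coin at time 2.
TRANSITIONS = {
    "a": [("c0", 0.5), ("c1", 0.5)],
    "b": [("c", 1.0)],
    "c0": [("z0", 1.0)],
    "c1": [("z1", 1.0)],
    "c": [("z0", 0.5), ("z1", 0.5)],
    "z0": [("z0", 1.0)],
    "z1": [("z1", 1.0)],
}
OBSERVATIONS = {"a": "s", "b": "s", "c0": "m", "c1": "m", "c": "m", "z0": "0", "z1": "1"}


def guess_initial_state(chain, samples=20):
    """Non-adaptive strategy: the rewinding pattern is fixed before any observation."""
    time_one, _ = chain.step_from(0)  # advance once from the initial node
    # Rewind to the time-1 node `samples` times and record what follows it.
    seen = {chain.step_from(time_one)[1] for _ in range(samples)}
    # From "a" every continuation agrees; from "b" they disagree with high probability.
    return "a" if len(seen) == 1 else "b"


if __name__ == "__main__":
    for start in ("a", "b"):
        chain = RewindableChain(TRANSITIONS, OBSERVATIONS, start, seed=0)
        print(start, guess_initial_state(chain), chain.queries)
```

In this sketch the strategy uses `samples + 1` queries and errs only when the start state is `b` and all resampled continuations happen to agree, which occurs with probability 2^-(samples - 1). The query counter is a stand-in for the query complexity measure described above.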

Paper Structure (24 sections, 15 theorems, 40 equations, 10 figures, 1 algorithm)

Introduction
Markov chains with rewinding:
Our Contributions
Related Work
Connection to Sublinear-Time Graph Algorithms
Motivating Examples & Technical Overview
Example 1: The non-trivial power of (non-adaptive) rewinding
A simple algorithm with $\widetilde{O}(d^2)$ query complexity:
Improving query complexity to $O(d)$:
Example 2: The power of adaptivity
Technical Overview
The Formal Model
Partitions
A Polynomially Optimal Non-Adaptive Algorithm
A Non-Adaptive Algorithm with Polynomial Queries
...and 9 more sections

Key Result

Theorem 1.0

Given a partially observable Markov chain $M = (\Omega, P, O)$, there exists a non-adaptive algorithm and a constant $c = c(\lvert\Omega\rvert)$ that can distinguish between any two states $a, b \in \Omega$ in time

Figures (10)

Figure 2.1: A canonical Markov chain for which there is a (large) polynomial gap between adaptive and non-adaptive strategies of distinguishing states $q_1$ and $q_2$.
Figure 6.1: Represents a general Markov chain $M$ with observations $\Sigma = \{\sigma_1, \sigma_2, \sigma_3\}$, where $O_{s_1} = \sigma_1, O_{s_2} = \sigma_2$, and $O_{s_3} = \sigma_2$.
Figure 6.2: Represents a canonical Markov chain ${\hat{M}}$, with parameter $0<q<1$, where there is an injection $\phi$ that reduces distinguishing states $a$ and $a'$ in $M$ to distinguishing $\phi(a)$ and $\phi(a')$ in ${\hat{M}}$.
...and 5 more figures

Theorems & Definitions (39)

Theorem 1.0
Theorem 1.1
Definition 3.1: Markov Chains with Rewinding
Definition 3.2: Query Complexity
Definition 3.3: State Identification Algorithms
Definition 3.4: Canonical p.o. Markov Chains
Definition 3.5: Partition
Definition 3.6: Total Variation Distance w.r.t. Partitions
Lemma 3.7
proof
...and 29 more
