Self-interacting processes via Doob conditioning
Francesco Coghi, Juan P. Garrahan
TL;DR
This work shows that self-interacting processes with dynamics conditioned on trajectory-wise occupation measures can be understood as optimal Doob transforms of underlying Markov processes. By recasting conditioning as a nonlocal, multi-time constraint and employing a tensor-network formalism, the authors show that the conditioned dynamics are themselves self-interacting Markov processes, formally realized as Doob dynamics in an extended state space. They illustrate the framework with random walk bridges, excursions, and forced excursions, deriving the time-dependent forces and value-function recursions that implement the conditioning. The approach offers a unifying perspective on memory effects in stochastic processes, suggests connections to reinforcement learning and open quantum dynamics, and provides practical tools for constructing conditioned ensembles efficiently.
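As a point of reference for the random walk bridge mentioned above, the textbook form of a Doob transform for a discrete-time chain conditioned on its endpoint can be written as follows. This is a sketch in assumed notation (P for the unconditioned transition matrix, h_t for the probability of satisfying the constraint from time t, A for the target set), not the tensor-network construction of the paper.

```latex
% Backward recursion for the conditioning probability h_t(x) = Prob(X_T \in A | X_t = x)
h_T(x) = \mathbf{1}[x \in A], \qquad
h_t(x) = \sum_y P(x,y)\, h_{t+1}(y), \quad t = T-1, \dots, 0 .

% Doob-transformed (conditioned) transition probabilities
\tilde{P}_t(x,y) = P(x,y)\, \frac{h_{t+1}(y)}{h_t(x)} .

% Symmetric \pm 1 random walk bridge (A = \{0\}), with n = T - t steps remaining:
\tilde{P}_t(x, x \pm 1) = \frac{n \mp x}{2n},
\qquad \text{mean drift} = -\frac{x}{n} .
```

The drift term is the time-dependent force in this simplest case: it pulls the walker back towards the origin and grows stronger as the remaining time n shrinks.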
Abstract
We connect self-interacting processes, that is, stochastic processes where transitions depend on the time spent by a trajectory in each configuration, to Doob conditioning. In this way we demonstrate that Markov processes with constrained occupation measures are realised optimally by self-interacting dynamics. We use a tensor network framework to guide our derivations. We illustrate our general results with new perspectives on well-known examples of self-interacting processes, such as random walk bridges, excursions, and forced excursions.
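To make the simplest of these examples concrete, the following is a minimal sketch of Doob conditioning for a symmetric random walk bridge, that is, a ±1 walk started at 0 and conditioned to return to 0 after T steps. It uses the standard backward recursion for the conditioning probability h and reweights each step by the ratio of h values. The function names `bridge_h` and `sample_bridge` are hypothetical, chosen for illustration, and the code is not the tensor-network method used in the paper.

```python
import random


def bridge_h(T):
    """Return h with h[t][x + T] = P(X_T = 0 | X_t = x) for a symmetric +/-1 walk.

    Positions x range over -T..T and are stored with an index offset of T.
    """
    size = 2 * T + 1
    h = [[0.0] * size for _ in range(T + 1)]
    h[T][T] = 1.0  # terminal condition: the walk must sit at x = 0 at time T
    for t in range(T - 1, -1, -1):
        for i in range(size):
            # Positions outside the grid cannot return to 0 in time, so they count as 0.
            up = h[t + 1][i + 1] if i + 1 < size else 0.0
            down = h[t + 1][i - 1] if i - 1 >= 0 else 0.0
            h[t][i] = 0.5 * (up + down)
    return h


def sample_bridge(T, rng):
    """Sample one bridge path of even length T using the Doob-transformed steps."""
    if T % 2 != 0:
        raise ValueError("T must be even for the walk to return to 0")
    h = bridge_h(T)
    x = 0
    path = [x]
    for t in range(T):
        # Conditioned probability of stepping up: p(up) * h_{t+1}(x + 1) / h_t(x)
        p_up = 0.5 * h[t + 1][x + 1 + T] / h[t][x + T]
        x += 1 if rng.random() < p_up else -1
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # assumed seed, for reproducibility only
    print(sample_bridge(10, rng))
```

Every sampled path ends at 0 by construction, so no trajectories are rejected. This is the sense in which the transformed dynamics generate the conditioned ensemble efficiently in this simple setting.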
