Controlling the low-temperature Ising model using spatiotemporal Markov decision theory
M. C. de Jongh, Richard J. Boucherie, M. N. M. van Lieshout
TL;DR
The paper introduces the spatiotemporal Markov decision process (STMDP) to model sequential decision problems with spatial interactions and asynchronous dynamics, filling gaps left by factored frameworks. It applies the STMDP to a finite 2D Ising model at low temperature evolving under Metropolis dynamics, with an external controller flipping spins at fixed adjustment times governed by the parameter κ. To analyze the control problem, the authors construct an auxiliary MDP whose states are local minima of the Hamiltonian, specifically configurations in which the plus-spins form a rectangle, and solve the Bellman equations recursively to reveal the structure of the optimal policy. They prove that the optimal policy undergoes a phase transition at discount factor λ_c = 15/17 and show that, for large κ, the auxiliary MDP faithfully approximates the original problem. Numerical experiments confirm that the policy derived from the auxiliary MDP speeds up nucleation more effectively than two heuristic policies. The work presents a general strategy for tackling high-dimensional STMDPs: reduce the state space to metastable minima and exploit a recursive Bellman analysis. Potential extensions include broader Gibbsian dynamics and other lattice-based control problems.
Abstract
We introduce the spatiotemporal Markov decision process (STMDP), a special type of Markov decision process that models sequential decision-making problems characterized not only by temporal but also by spatial interaction structures. To illustrate the framework, we construct an STMDP inspired by the low-temperature two-dimensional Ising model on a finite square lattice, evolving according to the Metropolis dynamics. We consider the situation in which an external decision maker aims to drive the system towards the all-plus configuration by flipping spins at specified moments in time. To analyze this problem, we construct an auxiliary MDP by reducing the configuration space to the local minima of the Hamiltonian. Leveraging the convenient form of this auxiliary MDP, we uncover the structure of the optimal policy by solving the Bellman equations recursively. Finally, we conduct a numerical study of the performance of the optimal policy obtained from the auxiliary MDP in the original Ising STMDP.
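To make the setting concrete, the following is a minimal Python sketch of a controlled Ising simulation of the kind the abstract describes. It is not the authors' implementation, and several details are assumptions made for illustration: the Hamiltonian convention H(σ) = -J Σ σ_x σ_y - h Σ σ_x, the periodic boundary conditions, the default values of J and h, the reading of κ as the number of Metropolis updates between controller interventions, and the hypothetical policy `flip_first_minus`.

```python
import math
import random


def delta_energy(spins, i, j, J=1.0, h=0.1):
    """Energy change from flipping spin (i, j).

    Assumes H = -J * sum over neighbor pairs - h * sum over sites,
    with periodic boundaries (both are illustrative assumptions).
    """
    n = len(spins)
    s = spins[i][j]
    nbrs = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
            + spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
    return 2.0 * s * (J * nbrs + h)


def metropolis_step(spins, beta, rng, J=1.0, h=0.1):
    """One single-spin-flip Metropolis update at inverse temperature beta."""
    n = len(spins)
    i, j = rng.randrange(n), rng.randrange(n)
    dE = delta_energy(spins, i, j, J, h)
    # Accept with probability min(1, exp(-beta * dE)).
    if dE <= 0 or rng.random() < math.exp(-beta * dE):
        spins[i][j] = -spins[i][j]


def flip_first_minus(spins):
    """Hypothetical heuristic policy: pick the first minus-spin, if any."""
    for i, row in enumerate(spins):
        for j, s in enumerate(row):
            if s == -1:
                return (i, j)
    return None


def simulate(n, beta, kappa, policy, max_steps, seed=0):
    """Run from all-minus; return the step at which all-plus is first seen.

    The controller acts every kappa steps (an assumed interpretation of
    the adjustment times). Returns None if all-plus is not reached.
    """
    rng = random.Random(seed)
    spins = [[-1] * n for _ in range(n)]
    for t in range(max_steps):
        if all(s == 1 for row in spins for s in row):
            return t
        if t % kappa == 0:
            site = policy(spins)
            if site is not None:
                i, j = site
                spins[i][j] = -spins[i][j]
        metropolis_step(spins, beta, rng)
    return None
```

A call such as `simulate(n=8, beta=2.0, kappa=50, policy=flip_first_minus, max_steps=100000)` returns the hitting time of the all-plus configuration, or `None` if it is not reached within the step budget. Swapping in a different `policy` function is one way to compare controllers, in the spirit of the numerical study mentioned above.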
