Finite-horizon Approximations and Episodic Equilibrium for Stochastic Games

Muhammed O. Sayin

TL;DR

This work addresses the gap between the analysis of finite-horizon and infinite-horizon stochastic games (SGs) by introducing a finite-horizon approximation and the notion of episodic equilibrium, in which strategies depend on the current state and the episode stage. It provides two guarantees: a bound on the approximation error that decays with the episode length $M$ for both time-averaged and discounted utilities, and learning dynamics that converge to (near) episodic equilibria in broad two-agent SG classes, including zero-sum, identical-interest, and certain general-sum games. The core mechanism is to view the infinite-horizon problem through a sequence of finite-horizon problems and to learn within episodic Markov strategies, with convergence supported by reductions to zero-sum and potential game structures and by backward-induction arguments. The results offer a unified, decentralized, model-free pathway to equilibrium in SGs, with practical relevance to RL applications where periodic behavior and horizon-limited planning are natural.

Abstract

This paper proposes a finite-horizon approximation scheme and introduces episodic equilibrium as a solution concept for stochastic games (SGs), where agents strategize based on the current state and episode stage. The paper also establishes an upper bound on the approximation error that decays with the episode length for both discounted and time-averaged utilities. This approach bridges the gap in the analysis of finite and infinite-horizon SGs, and provides a unifying framework to address time-averaged and discounted utilities. To show the effectiveness of the scheme, the paper presents episodic, decentralized (i.e., payoff-based), and model-free learning dynamics proven to reach (near) episodic equilibrium in broad classes of SGs, including zero-sum, identical-interest and specific general-sum SGs with switching controllers for both time-averaged and discounted utilities.
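
As an informal illustration of what "strategizing based on the current state and episode stage" could look like in code, the following Python sketch indexes a mixed strategy by the pair (stage, state). It is only a sketch under stated assumptions: the names `episode_stage`, `act`, and `strategy`, the toy states and actions, and the choice to compute the stage as `t % M` (episodes of fixed length `M` that restart back to back) are hypothetical and are not taken from the paper.

```python
import random


def episode_stage(t: int, M: int) -> int:
    """Stage within the current episode.

    Assumption: episodes have fixed length M and restart every M steps,
    so the stage is the time index modulo M.
    """
    return t % M


def act(strategy, state, t, M, rng):
    """Sample an action from a strategy indexed by (stage, state)."""
    probs = strategy[(episode_stage(t, M), state)]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical toy strategy: play "left" at stage 0 and "right" at later stages,
# regardless of the state.
M = 3
strategy = {
    (m, s): ({"left": 1.0, "right": 0.0} if m == 0 else {"left": 0.0, "right": 1.0})
    for m in range(M)
    for s in ("s0", "s1")
}

rng = random.Random(0)
first_action = act(strategy, "s0", t=3, M=M, rng=rng)   # t = 3 is stage 0
later_action = act(strategy, "s1", t=4, M=M, rng=rng)   # t = 4 is stage 1
```

A stationary Markov strategy would be the special case in which the distribution stored for a state is the same at every stage; the extra stage index is what lets behavior vary periodically within an episode.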

Paper Structure

This paper contains 6 sections, 43 equations, and 1 algorithm.

Introduction
Stochastic Games and Episodic Equilibrium
Finite-horizon Approximation
Episodic Individual Q-learning
Convergence Results
Conclusion
