Riemannian Manifold Learning for Stackelberg Games with Neural Flow Representations
Larkin Liu, Kashif Rasul, Yutong Chao, Jalal Etesami
TL;DR
This work addresses online learning in two-player general-sum Stackelberg games by mapping the joint action space to a smooth spherical Riemannian manifold, the Stackelberg manifold, using neural normalizing flows. The embedding enables online optimization along geodesics and a linear-bandit treatment of rewards on the manifold, with theoretical regret guarantees for learning Stackelberg equilibria. The authors introduce the Geodesic Isoplanar Subspace Alignment (GISA) framework, provide a bilevel-optimization-based analysis under parameter uncertainty, and validate the approach empirically in domains such as cybersecurity and supply chain optimization. Combining manifold learning with Stackelberg game theory offers improved computational efficiency, scalable equilibrium learning, and principled uncertainty handling, and highlights neural normalizing flows as a novel tool for multi-agent online learning on manifolds.
Abstract
We present a novel framework for online learning in Stackelberg general-sum games, where two agents, the leader and the follower, engage in sequential turn-based interactions. At the core of this approach is a learned diffeomorphism that maps the joint action space to a smooth spherical Riemannian manifold, referred to as the Stackelberg manifold. This mapping, parameterized by neural normalizing flows, ensures the formation of tractable isoplanar subspaces, enabling efficient techniques for online learning. Because the agents' reward functions are linear on the Stackelberg manifold, our construction permits the application of linear bandit algorithms. We then provide a rigorous theoretical basis for regret minimization on the learned manifold and establish bounds on the simple regret for learning Stackelberg equilibria. This integration of manifold learning into game theory uncovers a previously unrecognized potential for neural normalizing flows as an effective tool for multi-agent learning. We present empirical results demonstrating the effectiveness of our approach compared to standard baselines, with applications spanning domains such as cybersecurity and economic supply chain optimization.
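To make the linear-bandit idea concrete, the following is a minimal illustrative sketch, not the authors' GISA algorithm. It assumes the learned diffeomorphism has already placed joint actions on a unit sphere and that the reward is linear there, r = <theta, z> + noise. The parameter `theta_true`, the noise level, the sample size, and the helper names are all hypothetical choices made for illustration. The sketch estimates the reward parameter by ridge regression, picks the maximizer on the sphere, and reports the simple regret and the geodesic distance to the true optimum.

```python
import numpy as np


def geodesic_distance(u, v):
    """Great-circle distance between two unit vectors on the sphere."""
    return float(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0)))


def ridge_estimate(Z, r, lam=1.0):
    """Regularized least-squares estimate of theta from points Z (n x d)
    and observed rewards r (n,)."""
    d = Z.shape[1]
    A = Z.T @ Z + lam * np.eye(d)
    return np.linalg.solve(A, Z.T @ r)


def best_point_on_sphere(theta):
    """Maximizer of the linear reward <theta, z> over the unit sphere."""
    return theta / np.linalg.norm(theta)


rng = np.random.default_rng(0)
d = 3
theta_true = np.array([1.0, 2.0, -1.0])  # hypothetical reward parameter

# Sample exploration points uniformly on the unit sphere (assumed embedding output).
Z = rng.normal(size=(200, d))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)

# Noisy linear rewards observed on the manifold.
rewards = Z @ theta_true + 0.01 * rng.normal(size=200)

theta_hat = ridge_estimate(Z, rewards, lam=1e-3)
z_star = best_point_on_sphere(theta_true)  # true optimum
z_hat = best_point_on_sphere(theta_hat)    # recommended point

# Simple regret: reward lost by recommending z_hat instead of z_star.
simple_regret = float(theta_true @ z_star - theta_true @ z_hat)
gap = geodesic_distance(z_star, z_hat)
print(f"simple regret: {simple_regret:.2e}, geodesic gap: {gap:.2e}")
```

On a sphere the maximizer of a linear reward has the closed form theta / ||theta||, which is one reason a spherical embedding makes the optimization step tractable in this toy setting. The full method additionally has to learn the embedding and account for the follower's best response, neither of which is modeled here.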
