Riemannian Manifold Learning for Stackelberg Games with Neural Flow Representations

Larkin Liu, Kashif Rasul, Yutong Chao, Jalal Etesami

TL;DR

This work addresses online learning in two-player Stackelberg general-sum games by mapping the joint action space to a smooth spherical Stackelberg manifold using neural normalizing flows. This embedding enables geodesic-online optimization and a linear-bandit treatment of rewards on the manifold, with theoretical regret guarantees for learning Stackelberg equilibria. The authors introduce the Geodesic Isoplanar Subspace Alignment (GISA) framework, provide a bilevel-optimization-based analysis under parameter uncertainty, and validate the approach with empirical experiments in domains like cybersecurity and supply chain optimization. The combination of manifold learning with Stackelberg game theory offers improved computational efficiency, scalable equilibrium learning, and principled uncertainty handling, highlighting neural normalizing flows as a novel tool for multi-agent online learning on manifolds.

Abstract

We present a novel framework for online learning in Stackelberg general-sum games, where two agents, the leader and follower, engage in sequential turn-based interactions. At the core of this approach is a learned diffeomorphism that maps the joint action space to a smooth spherical Riemannian manifold, referred to as the Stackelberg manifold. This mapping, facilitated by neural normalizing flows, ensures the formation of tractable isoplanar subspaces, enabling efficient techniques for online learning. Leveraging the linearity of the agents' reward functions on the Stackelberg manifold, our construct allows the application of linear bandit algorithms. We then provide a rigorous theoretical basis for regret minimization on the learned manifold and establish bounds on the simple regret for learning Stackelberg equilibrium. This integration of manifold learning into game theory uncovers a previously unrecognized potential for neural normalizing flows as an effective tool for multi-agent learning. We present empirical results demonstrating the effectiveness of our approach compared to standard baselines, with applications spanning domains such as cybersecurity and economic supply chain optimization.
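
The mechanism described above (a map from joint actions onto a sphere, rewards that are linear in the mapped point, and a follower who best-responds to the leader) can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: `to_sphere` is a hand-written angular map standing in for the learned neural flow, the parameter vectors are arbitrary, and the grid search replaces the paper's bandit and geodesic machinery. It is not the authors' GISA procedure.

```python
import numpy as np

def to_sphere(a, b):
    """Hypothetical stand-in for the learned map: actions in [0, 1]^2 -> unit 2-sphere.

    The leader's action a sets the polar angle and the follower's action b
    sets the azimuth. The angle ranges avoid the poles and a full turn, so
    the map is injective on the unit square.
    """
    polar = np.pi * (0.1 + 0.8 * a)
    azimuth = 2.0 * np.pi * 0.9 * b
    return np.array([
        np.sin(polar) * np.cos(azimuth),
        np.sin(polar) * np.sin(azimuth),
        np.cos(polar),
    ])

def geodesic_distance(x, y):
    """Great-circle distance between two unit vectors."""
    return float(np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)))

def linear_reward(theta, a, b):
    """Reward that is linear in the mapped point: <theta, to_sphere(a, b)>."""
    return float(np.dot(theta, to_sphere(a, b)))

def stackelberg_by_grid(theta_leader, theta_follower, n=51):
    """Brute-force Stackelberg solution on an n-point grid (illustration only)."""
    grid = np.linspace(0.0, 1.0, n)
    best = None
    for a in grid:
        # The follower observes a and picks its best response on the grid.
        b_star = max(grid, key=lambda b: linear_reward(theta_follower, a, b))
        # The leader evaluates its own reward at the anticipated response.
        value = linear_reward(theta_leader, a, b_star)
        if best is None or value > best[2]:
            best = (float(a), float(b_star), value)
    return best

# Arbitrary (assumed) reward parameters for the two players.
theta_leader = np.array([0.2, -0.5, 1.0])
theta_follower = np.array([1.0, 0.3, -0.4])
a_star, b_star, leader_value = stackelberg_by_grid(theta_leader, theta_follower)
```

Here the parameters are known, so the equilibrium can be found by enumeration. In the online setting the abstract describes, they would have to be estimated from observed rewards, which is where linear bandit algorithms come in.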

Paper Structure

This paper contains 75 sections, 10 theorems, 121 equations, 17 figures, 3 tables, 4 algorithms.

Key Result

Lemma 2.1

Linear Relation for Smooth Invertible Maps: Suppose that $Y$ can be expressed as $Y = \langle \tilde{\theta}, \tilde{\phi}(X) \rangle$, where $\tilde{\phi}: \mathbb{R}^d \to \mathbb{R}^d$ is smooth and bijective, and $\tilde{\theta} \in \mathbb{R}^d$. Then, for any $k \geq d$, there exists an altern

Figures (17)

  • Figure 1: Bipartite & Bijective Neural Flow Architecture: We illustrate the bipartite structure of the normalizing flow architecture. Two players present joint actions $(\mathbf{a}, \mathbf{b})$, where each vector is mapped separately through a series of bijective transforms consisting of normalizing flow layers. Each player's action independently controls one subspace of the spherical manifold. The sequence of bijective transformations retains a fully bijective network from the ambient joint action space to the manifold space $\mathbf{\Phi}(\mathbf{a}, \mathbf{b})$. The network is invertible by design, and features a bipartite input.
  • Figure 2: Isoplanar subspaces for players A and B.
  • Figure 3: Geodesic confidence balls for players A and B.
  • Figure 4: Average cumulative regret performance across three Stackelberg games. Uncertainty regions denote the upper and lower quartiles of the experimental results. (Parameters of the simulations are outlined in Appendices \ref{sec:r1_game_details} - \ref{sec:multi-dim-ssg}.)
  • Figure 9: The Newsvendor Pricing Game. From liu:2024_stacknews_adt: in this Stackelberg game, there is a logistics network between a supplier (leader) and a retailer (follower), where utility functions are not necessarily supermodular. The supplier issues a wholesale price $a$, and the retailer responds with a purchase quantity $b$ and a retail price $p$.
  • ...and 12 more figures

Theorems & Definitions (25)

  • Definition 2.1
  • Lemma 2.1
  • Lemma 4.1
  • Definition 4.1
  • Definition 4.2
  • Lemma 4.2
  • Lemma 4.3
  • Theorem 1
  • Definition C.1
  • Definition C.2
  • ...and 15 more