Leveraging Graph Neural Networks and Multi-Agent Reinforcement Learning for Inventory Control in Supply Chains
Niki Kotecha, Antonio del Rio Chanona
TL;DR
The paper tackles inventory control in complex, uncertain supply chains with a graph-based multi-agent reinforcement learning framework: MAPPO agents whose actions parameterize an (s,S) inventory policy. The supply chain is represented as a graph and encoded with three-layer GCNs followed by global mean pooling, which supports coordinated decisions under limited information sharing while keeping execution decentralized. A regularized variant, Reg-P-GCN-MAPPO, injects Gaussian noise into the value function to improve exploration and reduce overfitting. The methods are validated on four supply chain configurations, where they show robust profits and scale well, especially as the number of agents grows. The work targets practical, adaptive inventory management in decentralized, graph-structured environments, and code is provided for reproduction.
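To make the parameterized (s,S) policy concrete, here is a minimal sketch of the classic rule: when the inventory position is at or below the reorder point s, order up to the level S. The function names, the `[-1, 1]` action range, and the rescaling in `action_to_policy` are illustrative assumptions, not the authors' implementation.

```python
def order_quantity(inventory_position, s, S):
    """Classic (s, S) rule: order up to S once the position is at or below s."""
    if S < s:
        raise ValueError("order-up-to level S must be >= reorder point s")
    if inventory_position <= s:
        return S - inventory_position
    return 0.0


def action_to_policy(action, max_level):
    """Hypothetical mapping from a policy-network output to (s, S).

    Assumes `action` is a pair in [-1, 1]; the first entry sets s and the
    second sets the gap S - s, so 0 <= s <= S <= max_level always holds.
    """
    a_s, a_gap = (min(1.0, max(-1.0, a)) for a in action)
    s = (a_s + 1.0) / 2.0 * max_level
    S = s + (a_gap + 1.0) / 2.0 * (max_level - s)
    return s, S


# One decision step for a single agent (illustrative numbers).
s, S = action_to_policy((0.0, 0.0), max_level=100.0)
order = order_quantity(inventory_position=40.0, s=s, S=S)
```

Because the agent outputs (s, S) rather than an order quantity directly, the replenishment rule keeps its familiar structure while its parameters can shift with system conditions.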
Abstract
Inventory control in modern supply chains has attracted significant attention due to the increasing number of disruptive shocks and the challenges posed by complex dynamics, uncertainties, and limited collaboration. Traditional methods, which often rely on static parameters, struggle to adapt to changing environments. This paper proposes a Multi-Agent Reinforcement Learning (MARL) framework with Graph Neural Networks (GNNs) for state representation to address these limitations. Our approach redefines the action space by parameterizing heuristic inventory control policies, making it adaptive as the parameters dynamically adjust based on system conditions. By leveraging the inherent graph structure of supply chains, our framework enables agents to learn the system's topology, and we employ a centralized learning, decentralized execution scheme that allows agents to learn collaboratively while overcoming information-sharing constraints. Additionally, we incorporate global mean pooling and regularization techniques to enhance performance. We test the capabilities of our proposed approach on four different supply chain configurations and conduct a sensitivity analysis. This work paves the way for utilizing MARL-GNN frameworks to improve inventory management in complex, decentralized supply chain environments.
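The graph-based state representation can be sketched as a stack of graph-convolution layers followed by global mean pooling. The sketch below uses the standard GCN propagation rule (symmetric normalization with self-loops) and three layers, as in the summary; the undirected three-node chain, the feature choice, the layer widths, and the random weights are assumptions for illustration only, not the authors' architecture or trained parameters.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the standard GCN normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, h, w):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_norm, h), w)]


def global_mean_pool(h):
    """Average node embeddings into a single graph-level vector."""
    return [sum(col) / len(h) for col in zip(*h)]


def random_weights(rng, n_in, n_out):
    """Untrained placeholder weights (assumption: stands in for learned ones)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]


def encode(adj, features, weights):
    """Apply one GCN layer per weight matrix, then pool over all nodes."""
    a_norm = normalized_adjacency(adj)
    h = features
    for w in weights:
        h = gcn_layer(a_norm, h, w)
    return h, global_mean_pool(h)


# Hypothetical serial chain: supplier - warehouse - retailer.
rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[10.0, 0.0], [5.0, 2.0], [8.0, 1.0]]  # e.g. [inventory, backlog] per node
weights = [random_weights(rng, 2, 4), random_weights(rng, 4, 4), random_weights(rng, 4, 3)]
node_embeddings, graph_embedding = encode(adj, features, weights)
```

Each layer mixes a node's features with those of its direct neighbours, so three layers let information travel up to three hops along the chain. The pooled vector gives a fixed-size summary of the whole network regardless of how many nodes it contains.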
