Mastering the Game of Guandan with Deep Reinforcement Learning and Behavior Regulating

Yifan Yanggong; Hao Pan; Lei Wang

TL;DR

This work introduces GuanZero, a Deep Monte-Carlo framework for mastering Guandan, a challenging imperfect-information card game. Its distinguishing feature is a novel state-action encoding that regulates cooperative behaviors. The approach combines distributed learning, LSTM-based history encoding, and a six-layer feedforward network to estimate $Q(s,a)$ and guide decision-making. Empirically, GuanZero outperforms random and rule-based baselines and benefits notably from explicit behavior regulation (cooperating, dwarfing, assisting), with training converging in under a week. The study targets complex, multi-agent, imperfect-information domains and points to future enhancements in behavior automation and tribute-strategy learning.
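To make the Monte-Carlo idea concrete, the following is a minimal sketch of estimating $Q(s,a)$ from episode returns and acting greedily on the estimate. It is not the GuanZero implementation: a lookup table stands in for the LSTM and feedforward network, and every name here (`TabularMonteCarloQ`, `select`, `update`, the string states and actions) is a hypothetical placeholder chosen for illustration.

```python
import random
from collections import defaultdict


class TabularMonteCarloQ:
    """Tabular stand-in (an assumption) for a learned Q(s, a) estimator."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon            # exploration rate, illustrative value
        self.rng = random.Random(seed)
        self.total = defaultdict(float)   # sum of returns per (state, action)
        self.count = defaultdict(int)     # number of visits per (state, action)

    def q(self, state, action):
        """Mean observed return for the pair, or 0.0 if it was never visited."""
        n = self.count[(state, action)]
        return self.total[(state, action)] / n if n else 0.0

    def select(self, state, legal_actions):
        """Epsilon-greedy choice among the legal actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q(state, a))

    def update(self, episode, final_return):
        """Every-visit Monte-Carlo update: credit the episode's final
        return to each (state, action) pair that occurred in it."""
        for state, action in episode:
            self.total[(state, action)] += final_return
            self.count[(state, action)] += 1


# Usage: two toy episodes with hand-written states and returns.
estimator = TabularMonteCarloQ(epsilon=0.0)
estimator.update([("s0", "pass"), ("s1", "play")], 1.0)
estimator.update([("s0", "pass")], -1.0)
best = estimator.select("s1", ["pass", "play"])
```

In a deep variant, the table would be replaced by a network trained to regress onto the same returns, which is where the history encoding and the state-action encoding described above would enter.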

Abstract

Games are a simplified model of reality and often serve as a favored platform for Artificial Intelligence (AI) research. Much of this research concerns game-playing agents and their decision-making processes. Guandan (literally, "throwing eggs") is a challenging game in which even professional human players sometimes struggle to make the right decision. In this paper we propose a framework named GuanZero for AI agents to master this game using Monte-Carlo methods and deep neural networks. The main contribution of this paper is a way of regulating agents' behavior through a carefully designed neural network encoding scheme. We then demonstrate the effectiveness of the proposed framework by comparing it with state-of-the-art approaches.

Paper Structure (20 sections, 9 figures, 10 tables)

Introduction
Background of Guandan
Methodology
Monte-Carlo Methods and Deep Neural Networks
State Representation
Cooperating
Dwarfing
Assisting
Neural Network Architecture
Distributed Learning
Experiments
Opposing Guandan Agents
Behavior Regulating
Cooperating
Dwarfing
...and 5 more sections

Figures (9)

Figure 1: Indexing scheme of cards
Figure 2: State representation of cards
Figure 3: Network architecture of GuanZero
Figure 4: The distributed learning process of GuanZero
Figure 5: History of WR achieved by GuanZero agents playing against DouZero-based ones
...and 4 more figures
