The ProfessionAl Go annotation datasEt (PAGE)
Yifan Gao, Danni Zhang, Haoyue Li
TL;DR
PAGE delivers the first large-scale, extensively annotated dataset of professional Go games, combining 98,525 records with rich metadata and KataGo-derived in-game statistics to enable rigorous data-driven analysis. The authors demonstrate PAGE’s value through three downstream tasks: gender participation analysis, blunder prediction using CNN/Transformer architectures, and game outcome prediction with multiple ML models, achieving strong results (e.g., 75.3% accuracy with CatBoost). They also discuss future directions in advanced statistics, behavior modeling, rating systems, and live commentary, highlighting PAGE’s potential to catalyze research in game analytics and psychology. The work provides a practical, publicly available resource that bridges Go studies with broader data science and cognitive-science questions, supporting both methodological development and empirical studies of human decision-making in a high-skill domain.
Abstract
The game of Go has been highly under-researched due to the lack of game records and analysis tools. In recent years, the increasing number of professional competitions and the advent of AlphaZero-based algorithms have provided an excellent opportunity to analyze human Go games on a large scale. In this paper, we present the ProfessionAl Go annotation datasEt (PAGE), which contains 98,525 games played by 2,007 professional players and spans over 70 years. The dataset includes rich AI analysis results for each move. Moreover, PAGE provides detailed metadata for every player and game after manual cleaning and labeling. Beyond a preliminary analysis of the dataset, we provide sample tasks that benefit from it to demonstrate the potential applications of PAGE in multiple research directions. To the best of our knowledge, PAGE is the first dataset with extensive annotation in the game of Go. This work is an extended version of [1], in which we provide a more detailed description, analysis, and application.
