A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI
Haruka Kita, Sotetsu Koyamada, Yotaro Yamaguchi, Shin Ishii
TL;DR
This work tackles the bidding phase of contract bridge by proposing a simple, reproducible baseline that combines supervised pretraining on SAYC-derived data with PPO-based reinforcement learning and fictitious self-play. The approach feeds a 480-dimensional binary input to a 4-layer, 1024-unit MLP that outputs a policy over 38 bidding actions together with a value estimate. The reward is given only at the end of the game and is computed from a DDS-derived score as $z = \text{score} / 7600$. Ablation studies show that supervised pretraining is essential for surpassing the WBridge5 benchmark, while fictitious self-play helps stabilize training. The method outperforms WBridge5 by +1.24 IMPs per board, and the code and models are released as open source to support reproducibility, evaluation, and further advances in bridge AI.
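To make the numbers in the summary above concrete, the following minimal Python sketch does the shape bookkeeping for such a network and applies the reward scaling $z = \text{score} / 7600$. It is not the authors' code. All names (`layer_shapes`, `parameter_count`, `normalized_reward`) are hypothetical, and the layout is an assumption: "4-layer" is read here as four hidden layers of 1024 units, followed by separate policy and value heads.

```python
# Illustrative sketch only: names and layer layout are assumptions,
# not taken from the released implementation.

INPUT_DIM = 480          # binary observation features (from the summary)
HIDDEN_UNITS = 1024      # units per hidden layer (from the summary)
NUM_HIDDEN_LAYERS = 4    # assumption: "4-layer" means four hidden layers
NUM_ACTIONS = 38         # 35 contract bids (7 levels x 5 strains) + pass, double, redouble
SCORE_SCALE = 7600       # divisor in z = score / 7600


def normalized_reward(score):
    """Scale an end-of-game bridge score to the reward z = score / 7600."""
    return score / SCORE_SCALE


def layer_shapes():
    """Return (fan_in, fan_out) pairs for the trunk and the two output heads."""
    dims = [INPUT_DIM] + [HIDDEN_UNITS] * NUM_HIDDEN_LAYERS
    trunk = list(zip(dims[:-1], dims[1:]))
    heads = {
        "policy": (HIDDEN_UNITS, NUM_ACTIONS),  # logits over bidding actions
        "value": (HIDDEN_UNITS, 1),             # scalar value estimate
    }
    return trunk, heads


def parameter_count():
    """Count weights and biases of all dense layers under the assumed layout."""
    trunk, heads = layer_shapes()
    shapes = trunk + list(heads.values())
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


if __name__ == "__main__":
    print("trunk layers:", layer_shapes()[0])
    print("parameters:", parameter_count())
    print("reward for a score of +1520:", normalized_reward(1520))
```

Under these assumptions the network has a few million parameters, and because 7600 is the largest score magnitude a single deal can produce, the scaled reward stays within $[-1, 1]$.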
Abstract
Contract bridge, a cooperative game characterized by imperfect information and multi-agent dynamics, poses significant challenges and serves as a critical benchmark in artificial intelligence (AI) research. Success in this domain requires agents to cooperate effectively with their partners. This study demonstrates that an appropriate combination of existing methods can perform surprisingly well in bridge bidding against WBridge5, a leading benchmark for bridge bidding systems and a multiple-time World Computer-Bridge Championship winner. Our approach is notably simple, yet it outperforms the current state-of-the-art methods in this field. Furthermore, we have made our code and models publicly available as open-source software. This release provides a strong starting point for future bridge AI research, facilitating the development and verification of new strategies and advances in the field.
