A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI

Haruka Kita; Sotetsu Koyamada; Yotaro Yamaguchi; Shin Ishii

TL;DR

This work tackles the bidding phase of contract bridge with a simple, reproducible baseline that combines supervised pretraining on SAYC-derived data with PPO-based reinforcement learning and fictitious self-play (FSP). The model takes a 480-dimensional binary input and uses a 4-layer, 1024-unit MLP to predict a policy over 38 bidding actions and a value estimate. The end-of-game reward is the DDS-derived score, normalized as $z = \text{score} / 7600$. Ablation studies show that supervised pretraining is essential for surpassing WBridge5, while fictitious self-play helps stabilize training. The method reaches +1.24 IMPs/board against WBridge5 and is released as open source to foster reproducibility, evaluation, and further advances in bridge AI.
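
As a rough illustration of the shapes involved, the following NumPy sketch builds a network with the dimensions stated above and applies the reward normalization $z = \text{score} / 7600$. Only the sizes (480 inputs, 1024 units, 38 actions) and the constant 7600 come from the text. Reading "4-layer" as four hidden layers, the ReLU activation, the weight initialization, the separate linear policy and value heads, and all function names are assumptions made for this example, not the released implementation.

```python
import numpy as np

OBS_DIM = 480      # binary input features (from the text)
HIDDEN = 1024      # units per layer (from the text)
NUM_LAYERS = 4     # read here as four hidden layers (assumption)
NUM_ACTIONS = 38   # bidding actions (from the text)
MAX_SCORE = 7600   # reward normalizer (from the text)


def init_params(rng):
    """Create random weights; the initialization scheme is an assumption."""
    sizes = [OBS_DIM] + [HIDDEN] * NUM_LAYERS
    layers = [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]
    policy = (rng.normal(0.0, 0.01, size=(HIDDEN, NUM_ACTIONS)), np.zeros(NUM_ACTIONS))
    value = (rng.normal(0.0, 0.01, size=(HIDDEN, 1)), np.zeros(1))
    return {"layers": layers, "policy": policy, "value": value}


def forward(params, obs):
    """Map a batch of observations to action probabilities and value estimates."""
    h = obs.astype(np.float64)
    for w, b in params["layers"]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU (assumed activation)
    w, b = params["policy"]
    logits = h @ w + b
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    w, b = params["value"]
    value = (h @ w + b)[..., 0]
    return probs, value


def normalized_reward(score):
    """Scale an end-of-game score by 7600, as in z = score / 7600."""
    return score / MAX_SCORE


rng = np.random.default_rng(0)
params = init_params(rng)
obs = rng.integers(0, 2, size=(3, OBS_DIM))  # three random binary observations
probs, value = forward(params, obs)
```

Here `probs` has one row of 38 action probabilities per observation and `value` has one scalar per observation. Dividing by 7600 keeps the reward within $[-1, 1]$ as long as no single deal scores more than 7600 points in absolute value.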

Abstract

Contract bridge, a cooperative game characterized by imperfect information and multi-agent dynamics, poses significant challenges and serves as an important benchmark in artificial intelligence (AI) research. Success in this domain requires agents to cooperate effectively with their partners. This study demonstrates that an appropriate combination of existing methods can perform surprisingly well in bridge bidding against WBridge5, a leading benchmark program for bridge bidding and a multiple-time World Computer-Bridge Championship winner. Our approach is notably simple, yet it outperforms the current state-of-the-art methods in this field. Furthermore, we have made our code and models publicly available as open-source software. This release provides a strong starting point for future bridge AI research, facilitating the development and verification of new strategies.

Paper Structure (12 sections, 2 figures, 2 tables)

Introduction
Background: Contract Bridge Overview
Related Work
Methods: Training Recipe
Network Architecture and Input Features
Model Pretraining by Supervised Learning (SL)
Reinforcement Learning (RL)
Results
Performance against WBridge5
Ablation Study
Open-Source Software and Models
Limitations, Future Work, and Conclusion

Figures (2)

Figure 1: Ablation of each training component.
Figure 2: Comparison of standard self-play (w/o FSP) and FSP. Each entry shows the tanh-scaled IMPs/board of the model at step X against the model at step Y, where X > Y.
