End-to-End Chess Recognition

Athanasios Masouris, Jan van Gemert

TL;DR

This work presents end-to-end chess recognition from a single image, addressing the limitations of traditional multi-stage pipelines by predicting piece configurations directly without intermediate annotations. It introduces ChessReD, the first real-world dataset with 10,800 smartphone-captured images and FEN-based annotations, to evaluate end-to-end approaches under diverse viewing angles. Two end-to-end strategies are explored: a 64-square multi-label classification using a ResNeXt backbone and a relative object-detection approach based on DETR, the latter facing convergence challenges on ChessReD. On real data, the classification approach achieves 15.26% board accuracy, roughly 7x the prior state-of-the-art, underscoring the promise of end-to-end learning for real-world chess recognition while highlighting areas for improvement in detector-based methods and dataset diversity.

Abstract

Chess recognition is the task of extracting the chess piece configuration from a chessboard image. Current approaches use a pipeline of separate, independent modules such as chessboard detection, square localization, and piece classification. Instead, we follow the deep learning philosophy and explore an end-to-end approach that directly predicts the configuration from the image, thus avoiding the error accumulation of sequential approaches and eliminating the need for intermediate annotations. Furthermore, we introduce a new dataset, the Chess Recognition Dataset (ChessReD), which consists of 10,800 real photographs and their corresponding annotations. In contrast to existing datasets, which are synthetically rendered and cover only a limited range of angles, ChessReD contains photographs captured from various angles using smartphone cameras, a sensor choice made to ensure real-world applicability. On this challenging benchmark, our approach outperforms related approaches, successfully recognizing the chess pieces' configuration in 15.26% of ChessReD's test images. This accuracy may seem low, but it is ~7x better than the current state-of-the-art and reflects the difficulty of the problem. The code and data are available through: https://github.com/ThanosM97/end-to-end-chess-recognition.
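
To make the 64-square formulation and the board-level metric concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each square is assigned one of 13 classes (12 piece types plus "empty"), that targets are derived from the piece-placement field of a FEN string, and that a board counts as correct only when all 64 squares match. The class ordering and the helper names `fen_to_labels` and `board_accuracy` are hypothetical.

```python
from typing import List, Sequence

# Hypothetical class order (an assumption, not taken from the paper):
# indices 0-11 are the piece letters below, index 12 means "empty square".
PIECES = "PNBRQKpnbrqk"
EMPTY = len(PIECES)


def fen_to_labels(placement: str) -> List[int]:
    """Convert a FEN piece-placement field into 64 class labels.

    Squares are ordered as in FEN: rank 8 to rank 1, file a to file h.
    """
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("expected 8 ranks in the piece-placement field")
    labels: List[int] = []
    for rank in ranks:
        row: List[int] = []
        for ch in rank:
            if ch.isdigit():
                # A digit encodes a run of consecutive empty squares.
                row.extend([EMPTY] * int(ch))
            elif ch in PIECES:
                row.append(PIECES.index(ch))
            else:
                raise ValueError(f"unexpected character {ch!r}")
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        labels.extend(row)
    return labels


def board_accuracy(preds: Sequence[Sequence[int]],
                   targets: Sequence[Sequence[int]]) -> float:
    """Fraction of boards whose 64 predicted labels all match the target."""
    if len(preds) != len(targets) or not targets:
        raise ValueError("preds and targets must be non-empty and equal length")
    correct = sum(1 for p, t in zip(preds, targets) if list(p) == list(t))
    return correct / len(targets)


# Usage: the standard starting position, plus a copy with one wrong square.
start = fen_to_labels("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
one_error = list(start)
one_error[0] = EMPTY  # the rook on a8 is missed
score = board_accuracy([start, one_error], [start, start])
```

Under this all-or-nothing metric a single misclassified square makes the whole board count as wrong, which is one reason board-level accuracy can be low even when most squares are recognized correctly.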

Paper Structure

This paper contains 24 sections, 7 figures, and 5 tables.

Figures (7)

  • Figure 1: Chess recognition input image and output configuration.
  • Figure 2: Image samples from ChessReD.
  • Figure 3: Bounding box and corner point annotations in ChessReD2K.
  • Figure 4: Sample pair of images for the ablation study.
  • Figure 5: Early-game (fewer than 30 moves) samples from ChessReD.
  • ...and 2 more figures