Chess Rating Estimation from Moves and Clock Times Using a CNN-LSTM

Michael Omori, Prasad Tadepalli

TL;DR

This work tackles dynamic, move-by-move chess rating estimation by learning directly from game moves and clock times. It introduces RatingNet, a CNN-LSTM architecture that takes board-state representations and per-move time usage as input and outputs a rating estimate after each move. The model is trained on a large Lichess dataset and validated on a separate puzzle-difficulty dataset. It reaches an average MAE of about 182 rating points across time controls; clock-time features provide notable gains, especially in faster time controls, and the approach generalizes to puzzle-rating prediction. The study demonstrates the feasibility of online, per-move rating estimation for real-time matchmaking, anomaly detection, and other applications beyond standard rating updates.

Abstract

Current chess rating systems update ratings incrementally and may not accurately reflect a player's true strength at all times, especially for rapidly improving or very rusty players. To address this, we explore a method to estimate player ratings directly from game moves and clock times. We compiled a benchmark dataset of over one million Lichess games covering various time controls, with move sequences and clock times. Our model uses a CNN to learn positional features, which are combined with clock-time data and fed into a bidirectional LSTM that predicts player ratings after each move. The model achieved an MAE of 182 rating points on the test data. We also applied the model to the 2024 IEEE Big Data Cup Chess Puzzle Difficulty Competition dataset, where it predicted puzzle ratings with competitive results. This model is the first to estimate chess ratings without hand-crafted features and the first to output a rating prediction after each move. Our method highlights the potential of move-based rating estimation for enhancing rating systems and possibly for other applications such as cheating detection.
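To make the per-move inputs concrete, here is a minimal sketch of how one time step (a board position plus the remaining clock time) could be encoded before reaching the CNN and LSTM. The 12-plane layout, the `encode_board` and `clock_feature` helpers, and the clock normalization are illustrative assumptions; the abstract does not specify the actual encoding.

```python
# Hypothetical input encoding; the plane layout and clock normalization
# are illustrative assumptions, not the paper's documented scheme.
PIECES = "PNBRQKpnbrqk"  # one 8x8 plane per piece type and colour


def encode_board(fen: str) -> list[list[list[int]]]:
    """Turn the piece-placement field of a FEN string into 12 binary 8x8 planes."""
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    placement = fen.split()[0]
    # FEN lists ranks from 8 down to 1, so row 0 is Black's back rank.
    for row, rank in enumerate(placement.split("/")):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch)][row][col] = 1
                col += 1
    return planes


def clock_feature(seconds_left: float, initial_seconds: float) -> float:
    """Fraction of the starting clock still available (assumed normalization)."""
    return seconds_left / initial_seconds


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
# One time step: board planes for the CNN, plus a scalar clock feature.
step_input = (encode_board(start), clock_feature(55.0, 60.0))
```

In a setup like this, the planes would go through the CNN, and the resulting feature vector would be concatenated with the clock scalar before entering the LSTM, one step per move.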

Paper Structure

This paper contains 19 sections, 7 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The model architecture used to predict chess ratings after each move. A CNN processes the board position, and an LSTM takes the resulting features together with the remaining clock time. The first two time steps of a sample game are shown.
  • Figure 2: White plays Nf5, a blunder because Black can capture the knight with the bishop on d3. White's estimated rating drops from 1255 to 1241. White's actual Lichess rating in this bullet game is 1224.
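
As a worked example of the error metric, the two estimates in Figure 2 can be scored against White's actual rating. This is a small sketch with hypothetical helper names; how the paper averages errors over moves and games to reach its reported MAE is not stated here.

```python
def absolute_errors(predictions, true_rating):
    """Absolute error of each per-move rating estimate."""
    return [abs(p - true_rating) for p in predictions]


def mean_absolute_error(predictions, true_rating):
    """Average of the per-move absolute errors."""
    errors = absolute_errors(predictions, true_rating)
    return sum(errors) / len(errors)


# Values from Figure 2: estimates before and after Nf5, and the true rating.
estimates = [1255, 1241]
actual = 1224

errors = absolute_errors(estimates, actual)    # [31, 17]
mae = mean_absolute_error(estimates, actual)   # 24.0
```

After the blunder the estimate moves closer to the true rating, so the absolute error falls from 31 to 17 points.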