Chess Rating Estimation from Moves and Clock Times Using a CNN-LSTM
Michael Omori, Prasad Tadepalli
TL;DR
This work tackles dynamic, move-by-move chess rating estimation by learning directly from game moves and clock times. It introduces RatingNet, a CNN-LSTM architecture that processes board-state representations and per-move time usage to output a rating estimate after each move. The model is trained on a large Lichess dataset and validated on a separate puzzle-difficulty dataset. It reaches an average MAE of about 182 rating points across time controls; clock-time features provide notable gains, especially in faster time controls, and the approach generalizes to puzzle-rating prediction. The study demonstrates the feasibility of online, granular rating estimation for real-time matchmaking, anomaly detection, and applications beyond standard rating updates.
Abstract
Current chess rating systems update ratings incrementally and may not accurately reflect a player's true strength at a given moment, especially for rapidly improving or very rusty players. To overcome this, we explore a method that estimates player ratings directly from game moves and clock times. We compiled a benchmark dataset of over one million Lichess games spanning various time controls, with move sequences and clock times for each game. Our model architecture comprises a CNN that learns positional features, which are then combined with clock-time data and fed into a bidirectional LSTM that predicts player ratings after each move. The model achieved a mean absolute error (MAE) of 182 rating points on the test data. Additionally, we applied our model to the 2024 IEEE Big Data Cup Chess Puzzle Difficulty Competition dataset to predict puzzle ratings and achieved competitive results. This model is the first to estimate chess ratings without hand-crafted features and also the first to output a rating prediction after each move. Our method highlights the potential of move-based rating estimation for enhancing rating systems and possibly for other applications such as cheating detection.
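As a minimal illustration of the reported metric, the sketch below computes the MAE of a sequence of per-move rating predictions against a player's actual rating. The function name, the example values, and the choice to average uniformly over all moves of a game are assumptions made for illustration; the aggregation used in the actual evaluation may differ.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between per-move predictions and one true rating.

    Assumption: every move's prediction is weighted equally.
    """
    if not predicted:
        raise ValueError("need at least one prediction")
    return sum(abs(p - actual) for p in predicted) / len(predicted)


# Hypothetical per-move predictions for one player in a single game.
per_move_predictions = [1500.0, 1580.0, 1650.0, 1700.0]
true_rating = 1700.0

game_mae = mean_absolute_error(per_move_predictions, true_rating)
print(game_mae)  # 92.5
```

In this made-up game the estimate moves toward the true rating as more moves are observed, so the per-game MAE is dominated by the early moves.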
