Billiards Sports Analytics: Datasets and Tasks
Qianru Zhang, Zheng Wang, Cheng Long, Siu-Ming Yiu
TL;DR
This work addresses the lack of publicly available analytics data for billiards by introducing a large-scale 9-ball dataset that captures break-shot layouts, ball trajectories, and performance indicators from 94 tournaments. It defines three tasks, each with a dedicated model: BLCNN for break-shot layout prediction, BLGAN for generating high-quality break layouts, and BL2Vec for efficient similar-layout retrieval through a triplet CNN-based embedding. The methods achieve strong results on their respective tasks: BLCNN delivers high prediction accuracy, BLGAN produces realistic layouts that users deem easy to clear, and BL2Vec offers substantial speedups (up to 420× faster) and better ranking performance than baselines. The dataset and code are publicly available, enabling broader research on spatial-layout understanding, tactical analysis, and fan-oriented retrieval in billiards and potentially in other sports.
Abstract
It has become common practice to capture data from sports games with devices such as GPS sensors and cameras and to use the data for various analyses, including tactics discovery, similar game retrieval, and performance study. While this practice has been applied to many sports, such as basketball and soccer, it remains largely unexplored in billiards, mainly because of the lack of publicly available datasets. Motivated by this, we collect a billiards dataset that includes the layouts (i.e., locations) of the balls after break shots, called break shot layouts; the traces of the balls resulting from strikes (in the form of trajectories); and detailed statistics and performance indicators. We then study and develop techniques for three tasks on the collected dataset: (1) prediction and (2) generation on the layout data, and (3) similar billiards layout retrieval, which can serve different users such as coaches, players, and fans. We conduct extensive experiments on the collected dataset, and the results show that our methods are effective and efficient.
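To make the similar-layout retrieval task more concrete, the following is a minimal sketch of embedding-based retrieval with a triplet objective. It is not the BL2Vec model: the occupancy-grid embedding is a hand-written stand-in for a learned CNN embedding, and the function names, table dimensions, grid size, and margin are all assumptions chosen for illustration.

```python
import numpy as np

def embed_layout(balls, grid=(4, 8), table=(1.0, 2.0)):
    """Hypothetical stand-in for a learned embedding: a flattened occupancy grid.

    balls: array-like of shape (n, 2) holding (x, y) ball positions,
    with 0 <= x <= table[0] and 0 <= y <= table[1] (assumed units).
    """
    balls = np.asarray(balls, dtype=float)
    hist, _, _ = np.histogram2d(
        balls[:, 0], balls[:, 1],
        bins=grid, range=[[0.0, table[0]], [0.0, table[1]]],
    )
    return hist.ravel()

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss on Euclidean distances between embeddings."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def retrieve(query_vec, database_vecs, k=1):
    """Return indices of the k database embeddings closest to the query."""
    dists = np.linalg.norm(database_vecs - query_vec, axis=1)
    return np.argsort(dists)[:k]

# Toy layouts with two balls each (made-up coordinates, not from the dataset).
query = embed_layout([[0.10, 0.10], [0.90, 1.90]])
similar = embed_layout([[0.12, 0.15], [0.88, 1.85]])
different = embed_layout([[0.60, 1.00], [0.40, 0.60]])

database = np.stack([different, similar])
top = retrieve(query, database, k=1)  # index of the most similar layout
loss = triplet_loss(query, similar, different)
```

In a learned setting, the triplet loss would be minimized over many (anchor, positive, negative) layout triples so that similar layouts end up close in the embedding space. Retrieval then reduces to a nearest-neighbor search over fixed-length vectors, which is typically much cheaper than comparing raw layouts pairwise.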
