BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation
Yulu Pan, Ce Zhang, Gedas Bertasius
TL;DR
BASKET is a large-scale, long-form basketball video dataset for fine-grained skill estimation, comprising 4,477 hours of video and 32,232 players across 21 leagues and six seasons. The task is to predict the level of each of 20 fine-grained skills on a five-point scale from an 8–10 minute player highlight video, which demands long-range temporal understanding and implicit player identification. Comprehensive experiments show that current state-of-the-art video models perform poorly (at best about 28.5% accuracy) compared with human experts (up to 72%), with notable gaps in cross-season and cross-league generalization. The work provides extensive dataset details, ablations, and human studies, and argues that BASKET enables the development of truly long-range, fine-grained skill models and could support fair scouting and personalized player development tools.
Abstract
We present BASKET, a large-scale basketball video dataset for fine-grained skill estimation. BASKET contains 4,477 hours of video capturing 32,232 basketball players from all over the world. Compared to prior skill estimation datasets, our dataset includes a massive number of skilled participants with unprecedented diversity in gender, age, skill level, geographical location, and other attributes. BASKET includes 20 fine-grained basketball skills, challenging modern video recognition models to capture the intricate nuances of player skill through in-depth video analysis. Given a long highlight video (8–10 minutes) of a particular player, the model needs to predict the skill level (e.g., excellent, good, average, fair, poor) for each of the 20 basketball skills. Our empirical analysis reveals that current state-of-the-art video models struggle with this task, significantly lagging behind the human baseline. We believe that BASKET could be a useful resource for developing new video models with advanced long-range, fine-grained recognition capabilities. In addition, we hope that our dataset will be useful for domain-specific applications such as fair basketball scouting, personalized player development, and many others. Dataset and code are available at https://github.com/yulupan00/BASKET.
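To make the task format concrete, the following is a minimal sketch of how predictions for this kind of task could be scored: each player receives one label per skill, and accuracy is averaged over the 20 skills. The function names, the integer encoding of the five levels, and the unweighted averaging are assumptions made for illustration only; they are not taken from the BASKET code release, and the official evaluation protocol may differ.

```python
from typing import List, Sequence

NUM_SKILLS = 20  # number of skills stated in the abstract
NUM_LEVELS = 5   # five skill levels; encoding them as 0..4 is an assumption


def per_skill_accuracy(preds: Sequence[Sequence[int]],
                       labels: Sequence[Sequence[int]]) -> List[float]:
    """Return the accuracy of each skill, computed over all players.

    preds and labels hold one row per player; each row contains
    NUM_SKILLS integers in the range [0, NUM_LEVELS).
    """
    if len(preds) != len(labels) or len(preds) == 0:
        raise ValueError("preds and labels must be non-empty and equally long")
    correct = [0] * NUM_SKILLS
    for pred_row, label_row in zip(preds, labels):
        if len(pred_row) != NUM_SKILLS or len(label_row) != NUM_SKILLS:
            raise ValueError(f"each row must contain {NUM_SKILLS} skill levels")
        for k in range(NUM_SKILLS):
            if not (0 <= pred_row[k] < NUM_LEVELS and 0 <= label_row[k] < NUM_LEVELS):
                raise ValueError("skill levels must lie in [0, NUM_LEVELS)")
            correct[k] += int(pred_row[k] == label_row[k])
    return [c / len(preds) for c in correct]


def mean_accuracy(preds: Sequence[Sequence[int]],
                  labels: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-skill accuracies (an assumed aggregate)."""
    accs = per_skill_accuracy(preds, labels)
    return sum(accs) / len(accs)


# Toy example with two hypothetical players: the first is predicted
# correctly on every skill, the second is wrong on every skill.
toy_preds = [[0] * NUM_SKILLS, [1] * NUM_SKILLS]
toy_labels = [[0] * NUM_SKILLS, [2] * NUM_SKILLS]
print(mean_accuracy(toy_preds, toy_labels))  # 0.5
```

Under this formulation, and assuming the five levels were equally frequent for every skill, uniform random guessing would score 20%, which gives a rough reference point for the accuracies reported above.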
