Generalizing Sports Feedback Generation by Watching Competitions and Reading Books: A Rock Climbing Case Study
Arushi Rai, Adriana Kovashka
TL;DR
This work tackles the challenge of generating actionable sports feedback from videos and addresses the generalization gap that arises when finetuning on a single sport. It introduces a cross-domain approach that combines abundant auxiliary data from the target domain (competition commentary and coaching texts) with limited feedback from a source domain, using LLM-based refinement and precise localization to produce well-aligned, high-quality feedback. The authors also propose two evaluation metrics, specificity and actionability, grounded in motor learning theory, and validate them with human annotations. Empirically, incorporating auxiliary data yields strong improvements in out-of-distribution feedback generation and demonstrates the complementary value of text data for enhancing actionability. The approach offers a practical, scalable pathway to domain-adaptive sports feedback, with interpretable evaluation metrics that go beyond traditional text-generation metrics.
Abstract
While video-LLMs with advanced reasoning capabilities are progressing rapidly, prior work shows that these models struggle on the challenging task of sports feedback generation and require finetuning feedback data that is expensive and difficult to collect for each sport. This limitation is evident from their poor generalization to sports unseen during finetuning. Furthermore, traditional text generation evaluation metrics (e.g., BLEU-4, METEOR, ROUGE-L, BERTScore), originally developed for machine translation and summarization, fail to capture the unique aspects of sports feedback quality. To address the first problem, using rock climbing as our case study, we propose using freely available auxiliary web data from the target domain, such as competition videos and coaching manuals, in addition to existing sports feedback from a disjoint source domain, to improve sports feedback generation performance on the target domain. To improve evaluation, we propose two evaluation metrics: (1) specificity and (2) actionability. Together, our approach enables more meaningful and practical generation of sports feedback under limited annotations.
