CueTip: An Interactive and Explainable Physics-aware Pool Assistant

Sean Memery, Kevin Denamganai, Jiaxin Zhang, Zehai Tu, Yiwen Guo, Kartic Subr

TL;DR

CueTip tackles the challenge of providing physics-aware, explainable coaching for pool via an interactive natural-language interface. It couples a physics simulator with a language model, producing event-based traces that guide shot suggestions and grounded explanations, while decoupling tactical decisions through a neural surrogate that can mimic various agents. The system incorporates uncertainty and strategy/difficulty awareness, enabling contextual, user-guided queries with explanations anchored in domain expert rules. Empirical results show competitive performance, reliable explanations grounded in rules, and favorable user perceptions, especially among expert players, demonstrating the value of combining physics grounding with interpretable language-based coaching. The work also introduces a shareable 3Pool environment to benchmark explainable physics-aware agents.

Abstract

We present CueTip, an interactive and explainable automated coaching assistant for a variant of pool/billiards. CueTip's novelty lies in its combination of three features: a natural-language interface, the ability to perform contextual, physics-aware reasoning, and explanations rooted in a set of predetermined guidelines developed by domain experts. We instrument a physics simulator so that it generates event traces in natural language alongside traditional state traces. Event traces lend themselves to interpretation by language models, which serve as the interface to our assistant. We design and train a neural adaptor that decouples the tactical choices made by CueTip from its interactivity and explainability, allowing it to be reconfigured to mimic any pool-playing agent. Our experiments show that CueTip enables contextual query-based assistance and explanations while maintaining the strength of the agent in terms of win rate (and improving it in some situations). The explanations generated by CueTip are physics-aware and grounded in the expert rules, and are therefore more reliable.
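
To make the idea of a natural-language event trace concrete, the following is a minimal sketch of how simulator events might be rendered as sentences that a language model can read. It is an illustration only, not the authors' implementation: the `Event` record, the three event kinds, the `TEMPLATES` wording, and the `describe` function are all hypothetical names assumed for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """One discrete simulator event (hypothetical schema, assumed for illustration)."""
    time: float  # seconds since the cue strike
    kind: str    # "ball-ball", "ball-cushion", or "ball-pocket"
    a: str       # the ball involved
    b: str       # the other ball, cushion, or pocket


# Assumed sentence templates, one per event kind.
TEMPLATES = {
    "ball-ball": "{a} ball collides with {b} ball",
    "ball-cushion": "{a} ball bounces off the {b} cushion",
    "ball-pocket": "{a} ball falls into the {b} pocket",
}


def describe(events: List[Event]) -> List[str]:
    """Return a chronological natural-language trace, one sentence per event."""
    ordered = sorted(events, key=lambda e: e.time)
    return [
        f"t={e.time:.2f}s: " + TEMPLATES[e.kind].format(a=e.a, b=e.b)
        for e in ordered
    ]


if __name__ == "__main__":
    shot = [
        Event(0.80, "ball-pocket", "red", "top-left"),
        Event(0.25, "ball-ball", "cue", "red"),
    ]
    print("\n".join(describe(shot)))
```

A trace of this kind could be placed in a language-model prompt next to the user's query, while the numeric state trace remains available for the physics side.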

Paper Structure

This paper contains 48 sections, 12 figures, 1 table, 1 algorithm.

Figures (12)

  • Figure 1: A standard pool agent (a) makes a decision on which shot $\theta$ to play for a given state $\mathbf x$ of the table. We propose an interactive assistant (b) that contains an embodiment of an agent but, in addition, is able to tune the choice of shot to user input $Q$. Our assistant produces $\theta$ and an explanation of this choice that is grounded in a set of rules defined by domain experts. An overview of CueTip and its interactions with the simulator and language model is shown in (c). Instead of using the agent directly within CueTip, we train a multilayer perceptron surrogate (d) to mimic any given agent statistically.
  • Figure 2: Left: Distributions of Likert scale distances between ground truth and estimations from different methods for each domain expert rule (mean $\pm$ std. error); Right: Aggregated results over all domain expert rules (mean $\pm$ std. error). Results obtained using the language model Llama-3.1-70B-Instruct.
  • Figure 3: Distributions of ratings given by users in our survey, where each user self-reported their expertise. Users who self-reported as having high experience (bottom) rated CueTip's explanations higher than the baseline.
  • Figure 4: The figure shows an initial state (center) where player 1 needs to choose a shot. The player interacts with CueTip prompting it with the state of the table and a query. Four example queries are shown in different quadrants. CueTip returns shot parameters (illustrated with a red 'x' for point of contact, a brown line for angle from vertical and a green bar on the right for power) and an explanation. Also shown are rules from the expert guidelines (detailed in Appendix A) and whether they apply positively (green) or negatively (red).
  • Figure 5: Left: Distributions of Likert scale distances between ground truth and estimations from different methods for each domain expert rule (mean $\pm$ std. error); Right: Aggregated results over all domain expert rules (mean $\pm$ std. error). Results obtained using the language model Llama-3.2-3B-Instruct.
  • ...and 7 more figures