ViSTAR: Virtual Skill Training with Augmented Reality with 3D Avatars and LLM coaching agent

Chunggi Lee, Hayato Saiki, Tica Lin, Eiji Ikeda, Kenji Suzuki, Chen Zhu-Tian, Hanspeter Pfister

TL;DR

ViSTAR is a Virtual Skill Training system in AR that supports self-guided basketball skill practice with feedback on balance, posture, and timing. It generates verbal feedback by analyzing spatio-temporal joint data and mapping the extracted features to natural-language coaching cues via a Large Language Model (LLM).

Abstract

We present ViSTAR, a Virtual Skill Training system in AR that supports self-guided basketball skill practice, with feedback on balance, posture, and timing. Informed by a formative study with basketball players and coaches, the system addresses three challenges: understanding skills, identifying errors, and correcting mistakes. ViSTAR follows the Behavioral Skills Training (BST) framework: instruction, modeling, rehearsal, and feedback. It provides feedback through visual overlays, rhythm and timing cues, and an AI-powered coaching agent using 3D motion reconstruction. We generate verbal feedback by analyzing spatio-temporal joint data and mapping features to natural-language coaching cues via a Large Language Model (LLM). A key novelty is this feedback generation: motion features become concise coaching insights. In two studies (N=16), participants generally preferred our AI-generated feedback to coach feedback and reported that ViSTAR helped them notice posture and balance issues and refine movements beyond self-observation.
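To make the "motion features become concise coaching insights" step concrete, the following is a minimal sketch of how per-joint deviations might be turned into a text prompt for an LLM. It is not the authors' implementation: the `JointDeviation` fields, the joint names, the tolerance values, and the prompt wording are all assumptions made for illustration, and no actual LLM call is shown.

```python
from dataclasses import dataclass


@dataclass
class JointDeviation:
    """Hypothetical summary of how one joint differs from the reference motion."""
    joint: str              # assumed joint label, e.g. "right_elbow"
    mean_angle_diff: float  # user angle minus reference angle, in degrees
    timing_offset_s: float  # positive means the user moves later than the reference


def describe(dev, angle_tol=10.0, timing_tol=0.15):
    """Turn one deviation into plain-language facts; tolerances are assumed values."""
    facts = []
    if abs(dev.mean_angle_diff) > angle_tol:
        size = "larger" if dev.mean_angle_diff > 0 else "smaller"
        facts.append(
            f"{dev.joint} angle is {abs(dev.mean_angle_diff):.0f} degrees "
            f"{size} than the reference"
        )
    if abs(dev.timing_offset_s) > timing_tol:
        when = "late" if dev.timing_offset_s > 0 else "early"
        facts.append(f"{dev.joint} moves {abs(dev.timing_offset_s):.2f} s {when}")
    return facts


def build_prompt(skill, deviations, max_facts=3):
    """Rank joints by angular error and pack the top facts into a coaching prompt."""
    ranked = sorted(deviations, key=lambda d: abs(d.mean_angle_diff), reverse=True)
    facts = [fact for dev in ranked for fact in describe(dev)][:max_facts]
    if not facts:
        return f"The player's {skill} closely matches the reference. Give brief praise."
    bullet_list = "\n".join(f"- {fact}" for fact in facts)
    return (
        f"You are a basketball coach. The player practiced a {skill}.\n"
        f"Observed differences from the reference motion:\n{bullet_list}\n"
        "Give one or two short, encouraging coaching cues."
    )


prompt = build_prompt(
    "jump shot",
    [
        JointDeviation("right_elbow", 18.0, 0.0),
        JointDeviation("left_knee", -4.0, 0.3),
    ],
)
print(prompt)
```

Keeping the feature-to-text step deterministic, as in this sketch, means the LLM only has to phrase the cue rather than interpret raw joint coordinates.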

Paper Structure (38 sections, 5 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Motion guidance strategies across BST stages. The top row represents Holistic Motion Guidance (e.g., Motion Trail, Moving Ghost, and Hit Judgement) that supports overall flow and coordination through path guidance. The bottom row shows Localized Motion Guidance (e.g., Skeleton, PoseMatching, Difference Sphere) focusing on joint-level feedback for fine-grained correction.
  • Figure 2: Overview of the system workflow. User motion is analyzed using pose estimation, DTW, and Random Forest, and the resulting analysis is used to generate motion guidance, which is visualized in Unity through multi-faceted feedback.
  • Figure 3: Verbal feedback generation pipeline. User motion is compared to ideal motion using DTW and a decision tree, and key misaligned joints are identified. An LLM generates feedback based on motion descriptors from the dataset.
  • Figure 4: Comparison between real and simulated motions, along with distribution visualizations.
  • Figure 5: (a) Overall usability ratings of ViSTAR across seven dimensions (e.g., helpfulness, engagement, applicability), showing high user satisfaction. (b) Comparison of identifiability and correction ratings between the self-observation baseline and ViSTAR, with ViSTAR receiving consistently higher scores.
  • ...and 2 more figures
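
The captions for Figures 2 and 3 mention comparing user motion to an ideal motion with DTW (dynamic time warping) and identifying key misaligned joints. As a rough illustration of that idea, the sketch below aligns per-joint angle sequences with a textbook DTW and ranks joints by their average aligned error. The use of joint angles as the feature, the absolute-difference cost, and the ranking rule are assumptions, not details taken from the paper, and the classifier stage (Random Forest or decision tree) is omitted.

```python
import math


def dtw_path(a, b):
    """Classic DTW between two 1-D sequences; returns (total cost, warping path)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover which frames were matched together.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def rank_misaligned_joints(user, reference):
    """Rank joints by mean aligned error, largest first.

    `user` and `reference` map a joint name to a list of angles in degrees
    (a hypothetical data layout chosen for this sketch).
    """
    scores = []
    for joint, user_seq in user.items():
        total, path = dtw_path(user_seq, reference[joint])
        scores.append((joint, total / len(path)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


user_motion = {
    "knee": [90, 100, 120, 150, 170],
    "elbow": [80, 85, 90, 95, 100],
}
ideal_motion = {
    "knee": [90, 100, 100, 120, 150, 170],  # same shape, slightly slower
    "elbow": [60, 65, 70, 75, 80],          # consistently smaller angles
}
ranking = rank_misaligned_joints(user_motion, ideal_motion)
print(ranking[0][0])
```

Because DTW absorbs differences in tempo, the knee sequence above aligns with zero error despite the extra frame, while the elbow's consistent offset pushes it to the top of the ranking.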