Real-time Animation Generation and Control on Rigged Models via Large Language Models
Han Huang, Fernanda De La Torre, Cathy Mengying Fang, Andrzej Banburski-Fahey, Judith Amores, Jaron Lanier
TL;DR
This paper tackles real-time, language-driven animation for rigged 3D characters by embedding a large language model (LLM) in Unity that emits structured motion strings encoding joint rotations and root translations. It introduces metaprompt designs to keep the output syntactically valid, and explores few-shot and zero-shot generation alongside an animation-control pathway via Unity's AnimationManager. The approach supports prompt-based transitions between animations and is demonstrated qualitatively on diverse rigs and multiple motions. The work offers a flexible framework for natural-language-driven animation prototyping and state management, aimed at faster animation design and iteration.
Abstract
We introduce a novel method for real-time animation control and generation on rigged models using natural language input. First, we embed a large language model (LLM) in Unity to output structured text that can be parsed into diverse and realistic animations. Second, we illustrate the LLM's potential to enable flexible state transitions between existing animations. We showcase the robustness of our approach through qualitative results on various rigged models and motions.
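
To make the "structured text parsed into animations" step concrete, here is a minimal Python sketch of parsing such text into keyframes. The line format (`t=...; root=x,y,z; Joint=rx,ry,rz`), the joint names, and the names `Keyframe` and `parse_motion` are all assumptions made for illustration; the paper's actual string format and its Unity-side parser are not specified here. Rejecting malformed lines stands in for the syntactic-validity check that the metaprompts are meant to make unnecessary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Keyframe:
    """One pose: a timestamp, a root translation, and per-joint Euler rotations."""
    time: float
    root: tuple[float, float, float]
    joints: dict[str, tuple[float, float, float]] = field(default_factory=dict)


def _vec3(text: str) -> tuple[float, float, float]:
    """Parse 'x,y,z' into three floats, raising ValueError on malformed input."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 components, got {len(parts)}: {text!r}")
    return (parts[0], parts[1], parts[2])


def parse_motion(text: str) -> list[Keyframe]:
    """Parse a hypothetical motion string, one keyframe per non-empty line."""
    frames: list[Keyframe] = []
    for line_no, raw in enumerate(text.strip().splitlines(), start=1):
        if not raw.strip():
            continue  # tolerate blank lines in the model output
        fields: dict[str, str] = {}
        for item in raw.split(";"):
            if not item.strip():
                continue
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"line {line_no}: missing '=' in {item!r}")
            fields[key.strip()] = value.strip()
        if "t" not in fields or "root" not in fields:
            raise ValueError(f"line {line_no}: 't' and 'root' are required")
        time = float(fields.pop("t"))
        root = _vec3(fields.pop("root"))
        # Every remaining field is treated as a joint name with Euler angles in degrees.
        joints = {name: _vec3(value) for name, value in fields.items()}
        frames.append(Keyframe(time, root, joints))
    return frames


# Hypothetical LLM output for a simple arm raise.
SAMPLE = """
t=0.0; root=0,0,0; LeftArm=0,0,45
t=0.5; root=0,0.1,0; LeftArm=0,0,90; RightArm=0,0,-90
"""
frames = parse_motion(SAMPLE)
```

In an engine, each parsed keyframe would then be applied to the rig's bones and interpolated over time; that step is engine-specific and omitted from this sketch.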
