Real-time Animation Generation and Control on Rigged Models via Large Language Models

Han Huang; Fernanda De La Torre; Cathy Mengying Fang; Andrzej Banburski-Fahey; Judith Amores; Jaron Lanier

TL;DR

This paper tackles real-time, language-driven animation for rigged 3D characters by embedding a large language model (LLM) in Unity that emits structured motion strings encoding joint rotations and root translations. It introduces metaprompt designs that keep the output syntactically valid, and it explores both few-shot and zero-shot generation alongside an animation-control pathway via Unity's AnimationManager. The approach supports prompt-based transitions between existing animations, and it is evaluated through qualitative demonstrations on several rigged models and motions. The work offers a flexible framework for natural-language-driven animation prototyping and state management, with potential impact on rapid animation design and iteration.

Abstract

We introduce a novel method for real-time animation control and generation on rigged models using natural language input. First, we embed a large language model (LLM) in Unity to output structured text that can be parsed into diverse and realistic animations. Second, we illustrate the potential of LLMs to enable flexible state transitions between existing animations. We showcase the robustness of our approach through qualitative results on various rigged models and motions.
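
To make the "structured text parsed into animation" step concrete, here is a minimal sketch of a parser for LLM-emitted motion strings. The paper's actual string format is not given on this page, so everything here is an assumption for illustration: the pipe-delimited line layout, the `root` key, Euler angles in degrees, and the names `Keyframe` and `parse_motion_string` are all hypothetical. It is written in Python rather than Unity C# to keep it self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical line format (an assumption, not the paper's format):
#   <time>|root:x,y,z|<joint>:rx,ry,rz|<joint>:rx,ry,rz ...
# Rotations are assumed to be Euler angles in degrees.
Vec3 = Tuple[float, float, float]


@dataclass
class Keyframe:
    time: float
    root_translation: Vec3
    joint_rotations: Dict[str, Vec3] = field(default_factory=dict)


def _parse_vec3(text: str) -> Vec3:
    """Parse 'x,y,z' into three floats, raising ValueError on bad input."""
    parts = text.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected 3 components, got {text!r}")
    x, y, z = (float(p) for p in parts)
    return (x, y, z)


def parse_motion_string(motion: str, known_joints: Set[str]) -> List[Keyframe]:
    """Validate and parse LLM output, rejecting joints the rig does not have."""
    keyframes: List[Keyframe] = []
    for line_no, line in enumerate(motion.strip().splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines in the LLM output
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected time and at least one entry")
        time = float(fields[0])
        root: Vec3 = (0.0, 0.0, 0.0)  # default when no root entry is given
        joints: Dict[str, Vec3] = {}
        for entry in fields[1:]:
            name, sep, value = entry.partition(":")
            name = name.strip()
            if not sep:
                raise ValueError(f"line {line_no}: missing ':' in {entry!r}")
            vec = _parse_vec3(value)
            if name == "root":
                root = vec
            elif name in known_joints:
                joints[name] = vec
            else:
                raise ValueError(f"line {line_no}: unknown joint {name!r}")
        keyframes.append(Keyframe(time, root, joints))
    return keyframes
```

Rejecting malformed lines and unknown joint names before anything reaches the rig mirrors the role the TL;DR assigns to the metaprompt: the prompt encourages syntactically valid output, and a strict parser of this kind would catch whatever slips through.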

Paper Structure (10 sections, 2 figures)

Animation Generation
  Motion on a Rigged Model
  LLM Metaprompt
  Few-shot Generation
  Zero-shot Generation
Animation Control
  LLM Metaprompt
  Generated Script
Comparison to Video Generation Approaches
Future Work

Figures (2)

Figure 1: Few-shot demonstrations provided for the rigged models. These are created by human animators and used as in-context learning examples for the LLM generator. Text bubbles contain their descriptions in the metaprompt.
Figure 2: Zero-shot animation generation on rigged models. Text bubbles contain the prompts used.
