Real-time Animation Generation and Control on Rigged Models via Large Language Models

Han Huang, Fernanda De La Torre, Cathy Mengying Fang, Andrzej Banburski-Fahey, Judith Amores, Jaron Lanier

TL;DR

This paper tackles real-time, language-driven animation for rigged 3D characters by embedding a large language model in Unity to emit structured motion strings that encode joint rotations and root translations. It introduces metaprompt designs to ensure syntactic validity, and explores few-shot and zero-shot generation alongside an animation-control pathway via Unity's AnimationManager. The approach enables prompt-based transitions and broad robustness across diverse rigs, validated through qualitative demonstrations on multiple motions. The work offers a flexible framework for natural-language-driven animation prototyping and state management with potential impact on rapid animation design and iteration.

Abstract

We introduce a novel method for real-time animation control and generation on rigged models using natural language input. First, we embed a large language model (LLM) in Unity to output structured text that can be parsed into diverse and realistic animations. Second, we illustrate the potential of LLMs to enable flexible state transitions between existing animations. We showcase the robustness of our approach through qualitative results on various rigged models and motions.
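
To make the idea of "structured text that can be parsed into animations" concrete, the following is a minimal, engine-agnostic sketch in Python. This section does not specify the paper's actual string format, so the line layout used here (`t=...; root=x,y,z; JointName=rx,ry,rz`), the `Keyframe` type, and the `parse_motion` function are all hypothetical and chosen only for illustration. The sketch also rejects joints that are not part of the rig, which is one plausible way to check syntactic validity before anything is applied to a model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Keyframe:
    """One parsed keyframe (hypothetical structure, not the paper's)."""
    time: float                      # seconds from the start of the clip
    root: Vec3                       # root translation (x, y, z)
    rotations: Dict[str, Vec3] = field(default_factory=dict)  # Euler degrees per joint


def _parse_vec3(text: str) -> Vec3:
    """Parse 'x,y,z' into a tuple of three floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 components, got {len(parts)}: {text!r}")
    return (parts[0], parts[1], parts[2])


def parse_motion(text: str, known_joints: Iterable[str]) -> List[Keyframe]:
    """Parse an assumed 'key=value; key=value' motion string, one keyframe per line."""
    allowed = set(known_joints)
    keyframes: List[Keyframe] = []
    for line_no, raw in enumerate(text.strip().splitlines(), start=1):
        fields: Dict[str, str] = {}
        for item in raw.split(";"):
            if not item.strip():
                continue
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"line {line_no}: missing '=' in {item!r}")
            fields[key.strip()] = value.strip()
        if "t" not in fields:
            raise ValueError(f"line {line_no}: missing time field 't'")
        time = float(fields.pop("t"))
        # Root translation defaults to the origin when omitted (an assumption).
        root = _parse_vec3(fields.pop("root", "0,0,0"))
        unknown = set(fields) - allowed
        if unknown:
            raise ValueError(f"line {line_no}: unknown joints {sorted(unknown)}")
        rotations = {joint: _parse_vec3(value) for joint, value in fields.items()}
        keyframes.append(Keyframe(time, root, rotations))
    return keyframes
```

A caller would pass the model's raw output together with the joint names of the target rig, for example `parse_motion(llm_output, {"LeftArm", "RightArm"})`, and treat a `ValueError` as a signal to re-prompt rather than to play the clip.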

Paper Structure (10 sections, 2 figures)

Figures (2)

  • Figure 1: Few-shot demonstrations provided for the rigged models. These are created by human animators and used as in-context learning examples for the LLM generator. Text bubbles contain their descriptions in the metaprompt.
  • Figure 2: Zero-shot animation generation on rigged models. Text bubbles contain the prompts used.