Sprite Sheet Diffusion: Generate Game Character for Animation
Cheng-An Hsieh, Jing Zhang, Ava Yan
TL;DR
Sprite Sheet Diffusion tackles the labor-intensive task of generating consistent, pose-conditioned sprite sheets for 2D games by formalizing it as conditional sequence generation: $f:(C,P)\to\hat{I}$, where $C$ is the reference appearance, $P=\{p_i\}_{i=1}^n$ is the pose sequence, and $\hat{I}=\{\hat{i}_i\}_{i=1}^n$ is the generated frame sequence, one frame per pose. The authors build a two-stage diffusion-based framework that extends Animate Anyone with three components: a ReferenceNet that encodes appearance, a Pose Guider that injects pose conditioning, and a Motion Module that enforces temporal coherence. The model is trained first for Pose-to-Image and then for Pose-to-Sprite. A sprite-centric dataset (150+ paired references with pose and action sequences) supports both in-sample and out-of-sample evaluation. Quantitative and qualitative results show gains over baselines such as SD-IPCN and vanilla Animate Anyone, and ablations highlight the Pose Guider’s critical role and the trade-offs in Stage 2 training. The work reduces manual workload in game development and opens pathways to broader applications in virtual avatars, storytelling, and education, while identifying two open challenges: fidelity of fine-grained details and overfitting during temporal training.
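To make the input/output contract of $f:(C,P)\to\hat{I}$ concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `Pose`, `Frame`, `placeholder_generator`, `generate_sprite_sequence`, and `sheet_layout` are all hypothetical, and the placeholder generator merely pairs the reference with each pose instead of running a diffusion model. The sketch only illustrates that the whole pose sequence is passed in at once, that exactly one frame comes back per pose, and how frames might be arranged on a row-major sheet.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Pose:
    """Hypothetical pose p_i, stored as 2D keypoints (x, y)."""
    keypoints: Tuple[Tuple[float, float], ...]


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one generated image i_hat_i."""
    reference_id: str
    pose: Pose


# f: (C, P) -> I_hat. The full sequence P is passed in one call so that a
# real model could enforce temporal coherence across frames.
Generator = Callable[[str, Sequence[Pose]], List[Frame]]


def placeholder_generator(reference_id: str, poses: Sequence[Pose]) -> List[Frame]:
    """Assumption: stands in for a trained model; no image synthesis happens."""
    return [Frame(reference_id, pose) for pose in poses]


def generate_sprite_sequence(
    reference_id: str,
    poses: Sequence[Pose],
    generator: Generator = placeholder_generator,
) -> List[Frame]:
    """Apply f and check the contract: n poses in, n frames out."""
    frames = generator(reference_id, poses)
    if len(frames) != len(poses):
        raise ValueError("generator must return exactly one frame per pose")
    return frames


def sheet_layout(n_frames: int, columns: int) -> List[Tuple[int, int]]:
    """Return the (row, column) cell of each frame on a row-major sheet."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [divmod(i, columns) for i in range(n_frames)]
```

For example, five poses laid out on a three-column sheet would occupy the first row fully and two cells of the second row. Swapping `placeholder_generator` for a real model would leave the rest of the sketch unchanged, since only the `Generator` signature is assumed.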
Abstract
In the game development process, creating character animations is a vital step that involves several stages. Typically for 2D games, illustrators begin by designing the main character image, which serves as the foundation for all subsequent animations. To create a smooth motion sequence, these subsequent animations involve drawing the character in different poses and actions, such as running, jumping, or attacking. This process requires significant manual effort from illustrators, as they must meticulously ensure consistency in design, proportions, and style across multiple motion frames. Each frame is drawn individually, making this a time-consuming and labor-intensive task. Generative models, such as diffusion models, have the potential to revolutionize this process by automating the creation of sprite sheets. Diffusion models, known for their ability to generate diverse images, can be adapted to create character animations. By leveraging the capabilities of diffusion models, we can significantly reduce the manual workload for illustrators, accelerate the animation creation process, and open up new creative possibilities in game development.
