Fast Sprite Decomposition from Animated Graphics

Tomoyuki Suzuki, Kotaro Kikuchi, Kota Yamaguchi

TL;DR

The paper addresses the problem of decomposing raster animated graphics into a compact sprite-based representation with static textures and per-frame affine/opacity parameters to enable interactive editing. It introduces a prior-based optimization framework, augmented by segmentation-based initialization from single-frame user annotations and an efficient rendering model, and validates it on the Crello Animation dataset. The method demonstrates improved quality/efficiency trade-offs over baselines like Layered Neural Atlases and Deformable Sprites, enabling practical editing applications such as texture replacement, sprite removal, and added rotations. This work advances editing workflows for animated graphics and provides a realistic benchmark for sprite-based decomposition in design-oriented video content.

Abstract

This paper presents an approach to decomposing animated graphics into sprites, a set of basic elements or layers. Our approach builds on the optimization of sprite parameters to fit the raster video. For efficiency, we assume static textures for sprites to reduce the search space while preventing artifacts using a texture prior model. To further speed up the optimization, we introduce the initialization of the sprite parameters utilizing a pre-trained video object segmentation model and user input of single frame annotations. For our study, we construct the Crello Animation dataset from an online design service and define quantitative metrics to measure the quality of the extracted sprites. Experiments show that our method significantly outperforms baselines for similar decomposition tasks in terms of the quality/efficiency tradeoff.
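The sprite representation described above (one static texture per sprite, plus per-frame affine and opacity parameters) can be illustrated with a small forward renderer. This is a minimal sketch under stated assumptions, not the authors' implementation: the dictionary keys `texture`, `affines`, and `opacities`, the nearest-neighbour sampling, and the solid background are all hypothetical choices made for illustration, and the paper's renderer would need to be differentiable to support gradient-based optimization.

```python
def invert_affine(m):
    """Invert a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("affine matrix is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)]]


def render_frame(sprites, t, height, width, background=(0.0, 0.0, 0.0)):
    """Composite all sprites at frame t, back to front, onto a solid background."""
    frame = [[list(background) for _ in range(width)] for _ in range(height)]
    for sprite in sprites:
        texture = sprite["texture"]  # static RGBA grid, shared by every frame
        inv = invert_affine(sprite["affines"][t])  # texture -> canvas, inverted
        opacity = sprite["opacities"][t]
        tex_h, tex_w = len(texture), len(texture[0])
        for y in range(height):
            for x in range(width):
                # Map the canvas pixel back into texture space (nearest neighbour).
                sx = round(inv[0][0] * x + inv[0][1] * y + inv[0][2])
                sy = round(inv[1][0] * x + inv[1][1] * y + inv[1][2])
                if not (0 <= sx < tex_w and 0 <= sy < tex_h):
                    continue
                r, g, b, a = texture[sy][sx]
                alpha = a * opacity
                # Standard "over" compositing onto whatever is already below.
                frame[y][x] = [alpha * s + (1.0 - alpha) * d
                               for s, d in zip((r, g, b), frame[y][x])]
    return frame


# Hypothetical example: a 2x2 opaque red square that moves and fades.
red_square = {
    "texture": [[(1.0, 0.0, 0.0, 1.0)] * 2 for _ in range(2)],
    "affines": [[[1, 0, 1], [0, 1, 1]],   # frame 0: shifted by (1, 1)
                [[1, 0, 2], [0, 1, 0]]],  # frame 1: shifted by (2, 0)
    "opacities": [1.0, 0.5],
}
frames = [render_frame([red_square], t, 4, 4) for t in range(2)]
```

Because the texture is stored once and only six affine values plus one opacity change per frame, edits such as replacing the texture or removing a sprite amount to changing one entry in the sprite list and re-rendering.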

Paper Structure

This paper contains 26 sections, 7 equations, 16 figures, and 4 tables.

Figures (16)

  • Figure 1: Sprite decomposition from animated graphics. Given a raster video and auxiliary bounding box annotations, our method decomposes the video into sprites that consist of static textures and animation parameters. The decomposed parameters are easily applicable to various video-editing applications.
  • Figure 2: Comparison of sprite representations of Layered Neural Atlases (LNA), Deformable Sprites (DS), and ours. Our approach limits the parameter space to static textures and affine transformations, which enables faster convergence while keeping the representation needed for animated graphics.
  • Figure 3: Our decomposition pipeline. Given a raster video and bounding box annotation for a single frame, we first apply a video object segmentation model to initialize texture and animation parameters. Then, we apply a gradient-based optimizer to find the optimal texture codes, animation parameters, and the texture prior parameters.
  • Figure 4: Comparison of the trade-off between the quality and optimization time on the test split. The solid lines show the average of the samples with four or fewer layers, and the dashed lines show the average of the samples with five and six layers.
  • Figure 5: Qualitative comparison between LNA, DS, and our method. We put the description of the animation above each sprite.
  • ...and 11 more figures