Compressed Skinning for Facial Blendshapes
Ladislav Kavan, John Doublestein, Martin Prazak, Matthew Cioffi, Doug Roble
TL;DR
This work targets real-time, on-device facial animation for rigs with large numbers of blendshapes by converting delta-blendshape data into a compressed linear blend skinning (LBS) form. It introduces a first-order optimization approach that combines the Adam optimizer with projection steps, which enforce non-negativity, partition of unity, and sparsity on the skinning weights and sparsity on the transformation components. The resulting sparse skinning decomposition is called compressed skinning. Key contributions are an explicit mapping from blendshape weights to LBS transforms, a sparsity-enforcing optimization framework, and a PyTorch implementation that supports flexible loss functions and constraints. Results show fitting accuracy comparable to or better than Dem Bones, with memory savings of 5–7× and run-time speedups of 2–3× on mobile hardware; an HD setting and higher-norm formulations recover additional detail. The approach enables scalable on-device facial animation for rigs with hundreds of blendshapes, at the cost of longer pre-processing. It also assumes a black-box rig evaluation, which future work may refine by integrating learned rigs or weighted vertex importance.
Abstract
We present a new method to bake classical facial animation blendshapes into a fast linear blend skinning representation. Previous work explored skinning decomposition methods that approximate general animated meshes using a dense set of bone transformations; these optimizers typically alternate between optimizing for the bone transformations and the skinning weights. We depart from this alternating scheme and propose a new approach based on proximal algorithms, which effectively means adding a projection step to the popular Adam optimizer. This approach is very flexible and allows us to quickly experiment with various additional constraints and/or loss functions. Specifically, we depart from the classical skinning paradigms and restrict the transformation coefficients to contain only about 10% non-zeros, while achieving accuracy and visual quality similar to the state of the art. The sparse storage enables our method to deliver significant savings in both memory and run-time speed. We include a compact implementation of our new skinning decomposition method in PyTorch, which is easy to experiment with and adapt to related problems.
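To make the "Adam plus a projection step" idea concrete, here is a minimal sketch on a toy problem. It is not the authors' implementation: the toy factorization X ≈ W T, the helper names project_weights and project_sparse, the matrix sizes, and the per-row limit k are all assumptions chosen for illustration, and the weight projection is a simple heuristic (keep top-k, clamp, renormalize) rather than an exact Euclidean projection. Both factors are updated jointly in each Adam step instead of alternating, and each step is followed by a projection back onto the constraint set.

```python
import torch

def project_weights(W, k):
    """Heuristic projection (assumption): keep the k largest entries per row,
    make them non-negative, and renormalize each row to sum to one."""
    W = W.clamp(min=0.0)
    top = torch.topk(W, k, dim=1)
    out = torch.zeros_like(W)
    # The small epsilon avoids division by zero if a row is entirely zero.
    out.scatter_(1, top.indices, top.values + 1e-8)
    return out / out.sum(dim=1, keepdim=True)

def project_sparse(T, nnz):
    """Keep only the nnz largest-magnitude entries of T; zero out the rest."""
    flat = T.flatten()
    idx = torch.topk(flat.abs(), nnz).indices
    out = torch.zeros_like(flat)
    out[idx] = flat[idx]
    return out.view_as(T)

torch.manual_seed(0)
n, b, m = 200, 8, 30           # toy sizes (hypothetical): rows, "bones", columns
k = 4                          # max non-zero weights per row (assumption)
nnz = int(0.1 * b * m)         # about 10% non-zeros in T, as in the abstract

X = torch.randn(n, m)          # stand-in for the data to approximate
W = torch.rand(n, b, requires_grad=True)
T = torch.randn(b, m, requires_grad=True)
opt = torch.optim.Adam([W, T], lr=1e-2)

for step in range(200):
    opt.zero_grad()
    loss = ((W @ T - X) ** 2).mean()   # any differentiable loss could go here
    loss.backward()
    opt.step()                          # unconstrained Adam update of W and T
    with torch.no_grad():               # projection step restores the constraints
        W.copy_(project_weights(W, k))
        T.copy_(project_sparse(T, nnz))
```

Because the constraints live entirely in the projection functions, swapping the loss or adding a constraint only touches one line or one helper, which is the flexibility the abstract refers to.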
