Refined Inverse Rigging: A Balanced Approach to High-fidelity Blendshape Animation

Stevo Racković, Cláudia Soares, Dušan Jakovetić

TL;DR

The paper tackles inverse rigging for blendshape animation, aiming for high-fidelity mesh reconstruction with sparse, temporally coherent weights. It proposes a sequence-wide optimization that integrates non-linear quartic corrective terms with $l_1$ sparsity and a temporal roughness penalty, solved via coordinate descent and augmented by a clustering-based parallelization for efficiency. The core contributions include a matrix-based formulation over $T$ frames, a data-fidelity term $E_{df}$, a sparsity term $E_{sr}$, and a temporal-smoothness term $E_{tsr}$, plus the clustering methods $RSJD$ and $RSJD_A$ to enable distributed optimization. Empirical results on Metahuman Jesse show improved mesh fidelity and smoother transitions, while clustering reduces computation time, making the approach viable for high-quality and potentially real-time facial animation applications.
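
To make the "coordinate descent with $l_1$ sparsity" idea concrete, here is a minimal sketch of textbook cyclic coordinate descent for an $l_1$-penalized least-squares problem. It is not the paper's solver: it assumes a purely linear blendshape model for a single frame (no quartic corrective terms, no temporal term), and the names `soft_threshold`, `coordinate_descent_lasso`, `B`, `m`, and `alpha` are hypothetical, chosen only for illustration.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (closed-form minimizer of one l1-penalized coordinate)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def coordinate_descent_lasso(B, m, alpha, n_sweeps=100):
    """Minimize 0.5 * ||B w - m||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    B is a list of rows (vertex coordinates x blendshapes); m is the target offset.
    Assumption: a linear rig stands in for the paper's quartic corrective model.
    """
    n_rows, n_cols = len(B), len(B[0])
    w = [0.0] * n_cols
    residual = list(m)  # residual = m - B w, and w starts at zero
    col_sq = [sum(B[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]
    for _ in range(n_sweeps):
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue  # an all-zero blendshape cannot reduce the error
            # Correlate column j with the residual computed without column j.
            rho = sum(B[i][j] * (residual[i] + B[i][j] * w[j]) for i in range(n_rows))
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= B[i][j] * delta
                w[j] = new_wj
    return w


# Toy rig with two orthogonal blendshapes; the second barely contributes to the target.
B_toy = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
m_toy = [1.0, 0.05, 0.0]
w_toy = coordinate_descent_lasso(B_toy, m_toy, alpha=0.1)
```

On this toy input the penalty sets the second weight exactly to zero and shrinks the first to roughly 0.9, which is the sparsifying behaviour the $l_1$ term is meant to provide.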

Abstract

In this paper, we present an advanced approach to solving the inverse rig problem in blendshape animation, using high-quality corrective blendshapes. Our algorithm introduces novel enhancements in three key areas: ensuring high data fidelity in reconstructed meshes, achieving greater sparsity in weight distributions, and facilitating smoother frame-to-frame transitions. While the incorporation of corrective terms is a known practice, our method differentiates itself by employing a unique combination of $l_1$ norm regularization for sparsity and a temporal smoothness constraint through roughness penalty, focusing on the sum of second differences in consecutive frame weights. A significant innovation in our approach is the temporal decoupling of blendshapes, which permits simultaneous optimization across entire animation sequences. This feature sets our work apart from existing methods and contributes to a more efficient and effective solution. Our algorithm exhibits a marked improvement in maintaining data fidelity and ensuring smooth frame transitions when compared to prior approaches that either lack smoothness regularization or rely solely on linear blendshape models. In addition to superior mesh resemblance and smoothness, our method offers practical benefits, including reduced computational complexity and execution time, achieved through a novel parallelization strategy using clustering methods. Our results not only advance the state of the art in terms of fidelity, sparsity, and smoothness in inverse rigging but also introduce significant efficiency improvements. The source code will be made available upon acceptance of the paper.
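
The two regularizers named in the abstract can be illustrated on a toy weight sequence. The sketch below assumes the roughness penalty is the sum of squared second differences along time (the abstract only says "sum of second differences", so the squaring is an assumption), and the function names and the `weights[t][j]` layout are hypothetical.

```python
def l1_norm(weights):
    """Sum of absolute weights over all frames and blendshapes (sparsity term)."""
    return sum(abs(w) for frame in weights for w in frame)


def roughness_penalty(weights):
    """Sum of squared second differences along time, for each blendshape.

    weights[t][j] is the weight of blendshape j at frame t.
    Assumption: squared differences; the source does not state the exact form.
    """
    total = 0.0
    for t in range(1, len(weights) - 1):
        for j in range(len(weights[t])):
            second_diff = weights[t - 1][j] - 2.0 * weights[t][j] + weights[t + 1][j]
            total += second_diff ** 2
    return total


# One blendshape over five frames.
smooth = [[0.0], [0.25], [0.5], [0.75], [1.0]]  # linear ramp
jittery = [[0.0], [0.5], [0.0], [0.5], [0.0]]   # frame-to-frame flicker

rough_smooth = roughness_penalty(smooth)
rough_jittery = roughness_penalty(jittery)
```

A linear ramp has zero second differences, so `rough_smooth` is 0, while the flickering sequence is penalized (`rough_jittery` is 3). Note that the flickering sequence has the smaller $l_1$ norm (1.0 versus 2.5), which shows why sparsity alone does not give smooth frame-to-frame transitions.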

Paper Structure

This paper contains 18 sections, 15 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Demonstrating the Efficacy of Temporally Coherent Blendshape Animation. First row: Our Quartic Smooth method captures the intricate dynamics of facial expressions by leveraging a sophisticated blendshape rig, ensuring both high-fidelity mesh reconstruction and smooth temporal transitions in animation weights. Second row: The Linear Smooth method, as proposed by seo2011compression, prioritizes temporal smoothness but simplifies the blendshape model to a linear function, resulting in a trade-off with mesh accuracy. Third row: The Quartic approach from rackovic2023distributed achieves a high degree of mesh fidelity by utilizing a complex blendshape model but does not account for the smoothness of frame-to-frame transitions, leading to potential discontinuities. Displayed are selected weight trajectories over 100 animation frames, with two consecutive frames magnified to showcase the mesh results. The second column employs red shading to illustrate mesh error, and the third column uses yellow to highlight discrepancies between successive frames, underscoring the balance between accuracy and smoothness in animation sequences.
  • Figure 2: Top row: The trade-off between Reconstruction Error ($E_R$) and Density ($E_D$) (left), and Inter-Density ($E_{ID}$) (right), across different clustering approaches, with annotations indicating the chosen number of clusters ($K$). Bottom row: Visualization of clusters obtained using $RSJD$ with $K=29$ (left) and $RSJD_A$ with $K=13$ (right). In addition to the mesh clusters, a bipartite graph representation is shown, using the same color coding, where the left partition denotes mesh vertices, and the right partition signifies the blendshape indices assigned to each cluster.
  • Figure 3: Parametric Analysis of Animation Metrics During Training, comparing our method (Quartic Smooth) with benchmarks. Top row: This graph illustrates the interplay between blendshape cardinality and maximum mesh error under various animation approaches. The color coding denotes different levels of the sparsity regularizer, $\alpha$, while individual points represent a spectrum of the smoothness parameter, $\beta$, converging at $\beta=0$ indicated by the solid gray line. The horizontal dotted line marks the cardinality of the actual animation data used as the ground truth, whereas the vertical dotted line indicates the mesh error that would result if no blendshapes were activated (all weights at $0$). Bottom row: Here, we chart the roughness penalty corresponding to the varying values of $\beta$ along the x-axis. The color scheme is consistent with the top graph, linked to the $\alpha$ values. The horizontal dotted line represents the benchmark roughness penalty derived from the ground-truth animation data.
  • Figure 4: Comparative Analysis of Training Metrics Across Different Parameterizations and Methodologies. The figure shows how applying clustering on top of our approach affects the overall results.
  • Figure 5: Results over the test set with the selected hyperparameter values corresponding to Table \ref{tab:test_parameters}. The execution time for the clustered approach is shown as a solid bar and a shaded bar: the solid bar indicates the execution time of the slowest cluster, which is the cost when the clusters are solved in parallel, while the shaded bar shows the time needed to solve the clusters sequentially.
  • ...and 1 more figure