Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph Neural Networks

Robert Kosk, Richard Southern, Lihua You, Shaojun Bian, Willem Kokke, Greg Maguire

TL;DR

This work presents Deep Spectral Meshes, a frequency-aware framework for 3D facial mesh synthesis that decomposes deformations into low- and high-frequency components using spectral mesh processing. Low-frequency deformations are represented in standardised Euclidean coordinates, while high-frequency details use a normalised deformation representation, enabling independent editing via a two-branch variational graph autoencoder. A Conditioning Factor is introduced to balance plausibility and artistic control, enabling disentangled high- and low-frequency editing and joint generation of plausible meshes. Across mesh reconstruction, interpolation, and multi-frequency editing tasks, the method improves perceptual quality (DAME) and maintains competitive geometric accuracy (L1) on multiple datasets, indicating practical gains for industrial-grade digital humans.

Abstract

With the rising popularity of virtual worlds, the importance of data-driven parametric models of 3D meshes has grown rapidly. Numerous applications, such as computer vision, procedural generation, and mesh editing, rely heavily on these models. However, current approaches do not allow for independent editing of deformations at different frequency levels. They also do not benefit from representing deformations at different frequencies with dedicated representations, which would better expose their properties and improve the geometric and perceptual quality of the generated meshes. In this work, spectral meshes are introduced as a method to decompose mesh deformations into low-frequency and high-frequency deformations. These low- and high-frequency deformation features are used for representation learning with graph convolutional networks. A parametric model for 3D facial mesh synthesis is built upon the proposed framework, exposing user parameters that control disentangled high- and low-frequency deformations. Independent control of deformations at different frequencies and generation of plausible synthetic examples are conflicting objectives. A Conditioning Factor is introduced to balance them. Our model takes further advantage of spectral partitioning by representing different frequency levels with different, more suitable representations. Low frequencies are represented with standardised Euclidean coordinates, and high frequencies with a normalised deformation representation (DR). This paper investigates applications of our proposed approach in mesh reconstruction, mesh interpolation, and multi-frequency editing. It is demonstrated that our method improves the overall quality of generated meshes on most datasets when considering both the $L_1$ norm and the perceptual Dihedral Angle Mesh Error (DAME) metrics.
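The low/high-frequency split described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a uniform (combinatorial) graph Laplacian instead of whatever weighting the paper uses, a dense eigendecomposition, and hypothetical helper names (`uniform_laplacian`, `spectral_split`). The low-frequency part is the projection of the vertex coordinates onto the first `k` Laplacian eigenvectors, and the high-frequency part is the residual.

```python
import numpy as np

def uniform_laplacian(num_vertices, edges):
    """Combinatorial Laplacian L = D - A (an assumption made for this sketch).

    `edges` is a list of unique undirected (i, j) vertex pairs.
    """
    lap = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        lap[i, j] -= 1.0
        lap[j, i] -= 1.0
        lap[i, i] += 1.0
        lap[j, j] += 1.0
    return lap

def spectral_split(vertices, laplacian, k):
    """Split vertex coordinates into low- and high-frequency parts.

    np.linalg.eigh returns eigenvalues in ascending order with orthonormal
    eigenvectors, so the first k columns span the k lowest frequencies.
    """
    _, eigvecs = np.linalg.eigh(laplacian)
    basis = eigvecs[:, :k]                 # (n, k) low-frequency basis
    low = basis @ (basis.T @ vertices)     # projection onto that basis
    high = vertices - low                  # everything the basis cannot express
    return low, high

# Toy example: a closed ring of 8 vertices with a zig-zag "detail" along z.
n = 8
ring_edges = [(i, (i + 1) % n) for i in range(n)]
angles = 2.0 * np.pi * np.arange(n) / n
zigzag = 0.1 * (-1.0) ** np.arange(n)
ring = np.stack([np.cos(angles), np.sin(angles), zigzag], axis=1)

lap = uniform_laplacian(n, ring_edges)
low, high = spectral_split(ring, lap, k=3)
```

In this toy case the smooth circle ends up in `low` and the alternating z-offsets end up in `high`, and `low + high` always recovers the input exactly. On real meshes with thousands of vertices, a sparse eigensolver would be the more practical choice than the dense `eigh` call used here.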

Paper Structure

This paper contains 28 sections, 11 equations, 11 figures, and 4 tables.

Figures (11)

  • Figure S1: Overview of our Deep Spectral Meshes graph neural network.
  • Figure S2: One-ring neighbourhood vertices and the angles used to calculate cotangent weights.
  • Figure S3: Cont.
  • Figure S4: Comparison of the reconstruction results with our method ($k=500$, $\gamma=1$, $\mathbf{Z}=64$) and with common representations used in other methods: Euclidean coordinates [Cheng2019, Hanocka2019MeshCNN:Edge, Zhou2020FullyKernels], standardised Euclidean coordinates [Bouritsas2019NeuralGeneration, Chen2021LearningModels, Gao2021LearningRepresentation, Gong2019SpiralNet++:Operator, Ranjan2018] and the normalised deformation representation (DR) [Jiang2019, Wu2018]. Across the Facsimile and FaceWarehouse datasets, our method outperforms the other representations in reconstructing examples from the training set and favourably balances perceptual and geometric quality on the Pareto front of optimal solutions. Our method underperforms on the FaceScape [Yang2020] dataset because the benefit of using the normalised DR for high-frequency information is minuscule compared to the standardised Euclidean representation.
  • Figure S5: Qualitative comparison of the reconstruction results on training data with our method ($k=500$, $\gamma=1$) and with common representations used in other methods: Euclidean coordinates [Cheng2019, Hanocka2019MeshCNN:Edge, Zhou2020FullyKernels], standardised Euclidean coordinates [Bouritsas2019NeuralGeneration, Chen2021LearningModels, Gao2021LearningRepresentation, Gong2019SpiralNet++:Operator, Ranjan2018] and the normalised deformation representation (DR) [Jiang2019, Wu2018]. The meshes generated by our method are of higher quality than those generated with the other feature representations. Zooming into the digital version is recommended to see the surface artefacts in the results generated with the Euclidean and standardised Euclidean representations.
  • ...and 6 more figures
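
The cotangent weights mentioned in the caption of Figure S2 can be sketched as follows. The caption does not give the formula, so this assumes the common convention $w_{ij} = \frac{1}{2}(\cot\alpha_{ij} + \cot\beta_{ij})$, where $\alpha_{ij}$ and $\beta_{ij}$ are the angles opposite edge $(i, j)$ in its two adjacent triangles; the paper's normalisation may differ. The function name `cotangent_weights` is hypothetical.

```python
import numpy as np

def cotangent_weights(vertices, faces):
    """Dense symmetric matrix of cotangent edge weights (assumed convention:
    half the sum of the cotangents of the angles opposite each edge)."""
    n = len(vertices)
    weights = np.zeros((n, n))
    for tri in faces:
        for c in range(3):
            i, j, o = tri[c], tri[(c + 1) % 3], tri[(c + 2) % 3]
            # The angle at vertex o is opposite edge (i, j) in this triangle.
            u = vertices[i] - vertices[o]
            v = vertices[j] - vertices[o]
            cot = np.dot(u, v) / np.linalg.norm(np.cross(u, v))
            weights[i, j] += 0.5 * cot
            weights[j, i] += 0.5 * cot
    return weights

# Toy example: a unit square split into two right triangles along edge (0, 2).
square = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0]])
square_faces = [(0, 1, 2), (0, 2, 3)]

W = cotangent_weights(square, square_faces)
cot_laplacian = np.diag(W.sum(axis=1)) - W   # rows sum to zero by construction
```

Here the diagonal edge (0, 2) faces two right angles and therefore gets weight 0, while each boundary edge faces a single 45° angle and gets weight 0.5. Degenerate (zero-area) triangles would cause a division by zero and are not handled in this sketch.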