Locally Adaptive Neural 3D Morphable Models
Michail Tarasiou, Rolandos Alexandros Potamias, Eimear O'Sullivan, Stylianos Ploumpis, Stefanos Zafeiriou
TL;DR
The Locally Adaptive Morphable Model (LAMM) addresses the challenge of fine-grained local control over dense 3D meshes. It is an end-to-end autoencoder that uses region-based tokenization and per-region control networks to overwrite encoded geometry with sparse control-point displacements. The method achieves state-of-the-art disentanglement and reconstruction by avoiding latent-space partitioning: it combines a single global latent code with region-specific processing, which yields dense, locally edited outputs and fast CPU inference on high-resolution meshes. Key contributions include an architecture built on region tokens and displacement networks, a self-supervised training scheme that morphs from mean to target shapes across multiple layers, and editing primitives such as region swapping and sampling, demonstrated on 12k- and 72k-vertex head and hand datasets. The approach scales to large meshes with reduced memory requirements and supports interactive mesh manipulation for applications such as avatar creation, animation, and digital editors.
Abstract
We present the Locally Adaptive Morphable Model (LAMM), a highly flexible Auto-Encoder (AE) framework for learning to generate and manipulate 3D meshes. We train our architecture following a simple self-supervised training scheme in which input displacements over a set of sparse control vertices are used to overwrite the encoded geometry in order to transform one training sample into another. During inference, our model produces a dense output that adheres locally to the specified sparse geometry while maintaining the overall appearance of the encoded object. This approach results in state-of-the-art performance in both disentangling manipulated geometry and 3D mesh reconstruction. To the best of our knowledge, LAMM is the first end-to-end framework that enables direct local control of 3D vertex geometry in a single forward pass. A very efficient computational graph allows our network to train with only a fraction of the memory required by previous methods and to run faster during inference, generating 12k-vertex meshes at >60 fps on a single CPU thread. We further leverage local geometry control as a primitive for higher-level editing operations and present a set of derivative capabilities such as swapping and sampling object parts. Code and pretrained models can be found at https://github.com/michaeltrs/LAMM.
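To make the self-supervised scheme described in the abstract more concrete, the following is a minimal NumPy sketch of how one training sample might be assembled. It is an illustration under stated assumptions and is not taken from the released code: the function names `make_training_pair` and `control_error`, the use of a mean shape to centre the input, and the choice to express control displacements relative to the source mesh are all hypothetical.

```python
import numpy as np

def make_training_pair(source, target, mean_shape, control_idx):
    """Assemble one self-supervised sample that morphs `source` into `target`.

    source, target, mean_shape: (V, 3) vertex arrays sharing one mesh topology.
    control_idx: (K,) integer indices of the sparse control vertices.
    All names and conventions here are assumptions made for illustration.
    """
    # Encoder input: the source mesh expressed as offsets from the mean shape.
    encoder_input = source - mean_shape
    # Sparse control signal: how far each control vertex must move to reach
    # its position in the target (assumed here to be relative to the source).
    control_disp = target[control_idx] - source[control_idx]
    # Dense supervision: the full target mesh.
    return encoder_input, control_disp, target

def control_error(pred, source, control_idx, control_disp):
    """Mean Euclidean distance between predicted and requested control positions."""
    requested = source[control_idx] + control_disp
    return float(np.linalg.norm(pred[control_idx] - requested, axis=1).mean())
```

A model trained on such samples would be asked to output a dense mesh whose control vertices land at the requested positions, so a prediction equal to the target gives a `control_error` of zero (up to floating-point rounding), while returning the unmodified source generally does not.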
