High-quality Animatable Eyelid Shapes from Lightweight Captures

Junfeng Lyu, Feng Xu

TL;DR

The paper addresses the challenge of producing high-quality eyelid reconstruction and animation from lightweight RGB video. It introduces an eyeball-calibrated dynamic neural SDF framework with a gaze-dependent adaptive anchor grid and an eyelid control module to enable semantic, gaze-driven animation. Experimental results on synthetic and real data show improved geometric fidelity and more realistic eyelid motion compared to baselines with similar capture setups. This work reduces capture costs and broadens the applicability of realistic digital humans in interactive media and AR/VR contexts.

Abstract

High-quality eyelid reconstruction and animation are challenging because of the subtle details and complicated deformations involved. Previous works usually face a trade-off between capture cost and the quality of details. In this paper, we propose a novel method that achieves detailed eyelid reconstruction and animation using only an RGB video captured by a mobile phone. Our method uses both static and dynamic information about the eyeballs (e.g., positions and rotations) to assist the eyelid reconstruction, together with an automatic eyeball calibration method that provides the required eyeball parameters. Furthermore, we develop a neural eyelid control module to achieve semantic animation control of the eyelids. To the best of our knowledge, this is the first method for high-quality eyelid reconstruction and animation from lightweight captures. Extensive experiments on both synthetic and real data show that our method provides more detailed and realistic results than previous methods that use capture setups of the same level. The code is available at https://github.com/StoryMY/AniEyelid.

Paper Structure

This paper contains 28 sections, 26 equations, 14 figures, 7 tables.

Figures (14)

  • Figure 1: The eyeball calibration is based on a 3D eyeball model with a physiological prior. We apply differentiable rendering to optimize the eyeball parameters by aligning iris masks.
  • Figure 2: An overview of our method. (A) Our method models the moving eyelids as a dynamic neural SDF field, implemented as a canonical hyper-space with deformation. For a 3D point sampled in the observation space, the topology and deformation networks map it to the canonical hyper-space, where the SDF field is defined. The point's color and SDF value, predicted by MLPs, are then used in volume rendering to train on RGB images. (B) This module divides the dynamic information into eye motions and other movements. The eye motions are modeled by a latent code mapped from the eyeball rotations, and the other movements are modeled by a learnable latent code. (C) We encode the geometry feature of a 3D canonical point using the positions of its neighboring anchors. For each frame, the anchor positions are given by a learnable base grid plus a linear combination of two learnable offset grids based on the eyeball rotations.
  • Figure 3: Topology coordinates model various shape templates of the eyelids.
  • Figure 4: Illustration of the disentanglement strategy. We generate pseudo deformation codes by replacing one part of the original deformation code with a random latent code $\boldsymbol{\varepsilon}$. The control region of the unchanged part is encouraged to keep the same hyper-space coordinates.
  • Figure 5: The contact loss encourages the inner surface of the eyelid to adhere tightly to the eyeball surface, which provides additional geometric constraints. Vertices are split into used and unused sets: all back vertices and visible frontal vertices are excluded.
  • ...and 9 more figures
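
The per-frame anchor computation described in the Figure 2 (C) caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `anchor_positions`, the grid names `base`, `offset_h`, and `offset_v`, and the choice of the raw horizontal and vertical rotation values (`yaw`, `pitch`) as the linear-combination weights are all assumptions made for illustration. In the paper the three grids are learnable; here they are fixed toy values.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def anchor_positions(
    base: List[Vec3],
    offset_h: List[Vec3],
    offset_v: List[Vec3],
    yaw: float,
    pitch: float,
) -> List[Vec3]:
    """Return per-frame anchors as base + yaw * offset_h + pitch * offset_v."""
    if not (len(base) == len(offset_h) == len(offset_v)):
        raise ValueError("base and offset grids must have the same number of anchors")
    anchors = []
    for b, oh, ov in zip(base, offset_h, offset_v):
        # ASSUMPTION: the weights are the raw rotation values; the caption only
        # says the linear combination is "based on eyeball rotations".
        anchors.append(
            tuple(bc + yaw * hc + pitch * vc for bc, hc, vc in zip(b, oh, ov))
        )
    return anchors


# Toy grid with two anchors (hypothetical values, not learned parameters).
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
offset_h = [(0.1, 0.0, 0.0), (0.1, 0.0, 0.0)]
offset_v = [(0.0, 0.2, 0.0), (0.0, 0.2, 0.0)]

# With zero rotation the anchors coincide with the base grid.
rest = anchor_positions(base, offset_h, offset_v, yaw=0.0, pitch=0.0)
# A non-zero vertical rotation shifts every anchor along its vertical offset.
looking_down = anchor_positions(base, offset_h, offset_v, yaw=0.0, pitch=-1.0)
print(rest)
print(looking_down)
```

Under these assumptions, a single set of grids covers every gaze direction: the base grid gives the anchors for the neutral gaze, and each offset grid describes how the anchors move as the eyeball rotates along one axis.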