Eye-See-You: Reverse Pass-Through VR and Head Avatars

Ankan Dash, Jingyi Gu, Guiling Wang, Chen Chen

TL;DR

The paper tackles the social isolation caused by VR headsets occluding eyes and facial expressions by introducing RevAvatar, an AI-driven framework for real-time reverse pass-through and one-shot 3D head avatars. It combines a fast 2D face restoration pipeline with a 3D avatar model based on tri-plane representations and 3DMM guidance, enabling outward display of gaze and expressions and immersive VR interactions. To accelerate development and generalization across devices, the authors release VR-Face, a 200k-sample VR-simulated dataset that captures occlusions, lighting variations, and distortions. Experimental results demonstrate real-time performance on mobile SoCs like the Apple M2 and competitive quality against state-of-the-art baselines, establishing RevAvatar and VR-Face as a new benchmark for AI-enabled VR social presence.

Abstract

Virtual Reality (VR) headsets, while integral to the evolving digital ecosystem, present a critical challenge: the occlusion of users' eyes and portions of their faces, which hinders visual communication and may contribute to social isolation. To address this, we introduce RevAvatar, an innovative framework that leverages AI methodologies to enable reverse pass-through technology, fundamentally transforming VR headset design and interaction paradigms. RevAvatar integrates state-of-the-art generative models and multimodal AI techniques to reconstruct high-fidelity 2D facial images and generate accurate 3D head avatars from partially observed eye and lower-face regions. This framework represents a significant advancement in AI4Tech by enabling seamless interaction between virtual and physical environments, fostering immersive experiences such as VR meetings and social engagements. Additionally, we present VR-Face, a novel dataset comprising 200,000 samples designed to emulate diverse VR-specific conditions, including occlusions, lighting variations, and distortions. By addressing fundamental limitations in current VR systems, RevAvatar exemplifies the transformative synergy between AI and next-generation technologies, offering a robust platform for enhancing human connection and interaction in virtual environments.

Paper Structure

This paper contains 23 sections, 5 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Our proposed RevAvatar framework for reverse pass-through, enabling the display of eyes and full-head avatars.
  • Figure 2: Sample processed images from the VR-Face dataset simulating VR environments.
  • Figure 3: (A) Left-eye, right-eye, and face alignment models based on CycleGAN. (B) The combined image is used as input for full-face restoration.
  • Figure 4: Overview of the 2D Face Restoration and One-Shot Avatar generation model. 2D Face Restoration: Partial VR observations and the reference DP are inputs to the Input Encoder, while the reference image is processed by the Reference Encoder. The restored face output drives the one-shot avatars. One-Shot Avatar: The DP image serves as the source, and the restored image from the 2D face restoration model drives the avatar generation. A tri-plane is generated from concatenated encoder outputs, followed by volumetric rendering and super-resolution to produce the final output.
  • Figure 5: Sample output from the 2D Face Restoration model, which can be used for reverse pass-through.
  • ...and 2 more figures
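
The Figure 4 caption mentions a tri-plane that feeds volumetric rendering. As a rough illustration of what querying a tri-plane can look like, the sketch below projects a 3D point onto three axis-aligned feature planes, samples each bilinearly, and sums the results. This is not the authors' implementation: the function names, the plane keys (`xy`, `xz`, `yz`), the [-1, 1] coordinate range, and aggregation by summation are all assumptions borrowed from common tri-plane practice.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly sample an (H, W, C) feature plane at u, v in [-1, 1].

    Hypothetical helper; assumes H >= 2 and W >= 2.
    """
    h, w, _ = plane.shape
    # Map normalized coordinates to continuous pixel coordinates.
    x = (u + 1.0) * 0.5 * (w - 1)
    y = (v + 1.0) * 0.5 * (h - 1)
    # Top-left corner of the enclosing cell, clamped so x0+1 and y0+1 stay in bounds.
    x0 = int(np.clip(np.floor(x), 0, w - 2))
    y0 = int(np.clip(np.floor(y), 0, h - 2))
    fx, fy = x - x0, y - y0
    top = (1 - fx) * plane[y0, x0] + fx * plane[y0, x0 + 1]
    bottom = (1 - fx) * plane[y0 + 1, x0] + fx * plane[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def triplane_features(planes, point):
    """Return the feature vector for one 3D point from three feature planes.

    `planes` is a dict with keys "xy", "xz", "yz" (an assumed layout).
    Summation is one common aggregation choice, assumed here.
    """
    x, y, z = point
    f_xy = bilinear_sample(planes["xy"], x, y)
    f_xz = bilinear_sample(planes["xz"], x, z)
    f_yz = bilinear_sample(planes["yz"], y, z)
    return f_xy + f_xz + f_yz

# Usage: three small planes with 8 feature channels each.
rng = np.random.default_rng(0)
planes = {k: rng.standard_normal((16, 16, 8)) for k in ("xy", "xz", "yz")}
feature = triplane_features(planes, (0.1, -0.4, 0.7))
print(feature.shape)
```

In a full pipeline, a small decoder would typically turn each sampled feature vector into density and color for volumetric rendering; that step is omitted here because the source gives no details about it.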