Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding
Haolin Xiong, Tianwen Fu, Pratusha Bhuvana Prasad, Yunxuan Cai, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, Yajie Zhao
TL;DR
Mind-to-Face introduces the first framework that converts non-invasive EEG signals into photorealistic 3D facial expressions by mapping EEG windows to dense 3D position maps and rendering them with 3D Gaussian Splatting. The approach relies on a synchronized dual-modality dataset of 16-channel EEG and high-speed multi-view video, and uses a CNN-Transformer encoder to predict dense geometry supervised by photogrammetric ground truth. It emphasizes personalized neural-to-expression mappings to capture subject-specific dynamics and demonstrates high-fidelity, view-consistent avatars even under occlusion. The work points toward emotion-aware telepresence and cognitive interaction in which neural activity drives realistic facial synthesis without visible facial recordings.
Abstract
Current expressive avatar systems rely heavily on visual cues and fail when faces are occluded or when emotions remain internal. We present Mind-to-Face, the first framework that decodes non-invasive electroencephalogram (EEG) signals directly into high-fidelity facial expressions. We build a dual-modality recording setup that captures synchronized EEG and multi-view facial video during emotion-eliciting stimuli, enabling precise supervision for neural-to-visual learning. Our model uses a CNN-Transformer encoder to map EEG signals into dense 3D position maps from which over 65k vertices can be sampled, capturing fine-scale geometry and subtle emotional dynamics. The position maps are then rendered through a modified 3D Gaussian Splatting pipeline for photorealistic, view-consistent results. Through extensive evaluation, we show that EEG alone can reliably predict dynamic, subject-specific facial expressions, including subtle emotional responses, demonstrating that neural signals contain far richer affective and geometric information than previously assumed. Mind-to-Face establishes a new paradigm for neural-driven avatars, enabling personalized, emotion-aware telepresence and cognitive interaction in immersive environments.
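To make the position-map representation concrete, the following is a minimal sketch of how a dense 3D position map can be read out as vertices. It is not the authors' implementation. The 256 x 256 resolution is an assumption chosen only because it yields 65,536 entries, which is consistent with "over 65k vertices"; the function names and the flat toy surface standing in for a decoder output are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
PositionMap = List[List[Vec3]]  # indexed as [row][col], each texel stores (x, y, z)


def make_toy_position_map(size: int) -> PositionMap:
    """Hypothetical stand-in for a decoder output: a flat plane with z = 0
    whose x and y coordinates span [0, 1] across the UV grid."""
    step = 1.0 / (size - 1)
    return [[(col * step, row * step, 0.0) for col in range(size)]
            for row in range(size)]


def dense_vertices(pos_map: PositionMap) -> List[Vec3]:
    """Read one vertex per texel, in row-major order."""
    return [xyz for row in pos_map for xyz in row]


def _lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    """Linear interpolation between two 3D points."""
    return (a[0] + (b[0] - a[0]) * t,
            a[1] + (b[1] - a[1]) * t,
            a[2] + (b[2] - a[2]) * t)


def sample_position_map(pos_map: PositionMap, u: float, v: float) -> Vec3:
    """Bilinearly interpolate a 3D position at UV coordinates in [0, 1]."""
    size = len(pos_map)
    # Clamp to the valid UV range, then convert to continuous texel coordinates.
    x = min(max(u, 0.0), 1.0) * (size - 1)
    y = min(max(v, 0.0), 1.0) * (size - 1)
    c0, r0 = int(x), int(y)
    c1, r1 = min(c0 + 1, size - 1), min(r0 + 1, size - 1)
    fx, fy = x - c0, y - r0
    top = _lerp(pos_map[r0][c0], pos_map[r0][c1], fx)
    bottom = _lerp(pos_map[r1][c0], pos_map[r1][c1], fx)
    return _lerp(top, bottom, fy)


if __name__ == "__main__":
    pos_map = make_toy_position_map(256)        # assumed resolution
    print(len(dense_vertices(pos_map)))         # number of vertices read from the map
    print(sample_position_map(pos_map, 0.5, 0.5))
```

Because geometry is stored on a regular UV grid, the same map can be read at its native resolution or interpolated at arbitrary UV locations, which is one reason an image-like output suits a convolutional or Transformer decoder.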
