SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models
Abhay Rawat, Shubham Dokania, Astitva Srivastava, Shuaib Ahmed, Haiwen Feng, Rahul Tallamraju
TL;DR
SynthForge introduces a 3D-aware synthetic data pipeline that leverages Next3D and FLAME-based 3DMMs to generate high-fidelity face images with dense multi-task annotations. A Multi-Task Synthetic Backbone is trained on this synthetic data to predict segmentation, depth, and keypoints, followed by Stage II label finetuning on real data to bridge domain gaps. The approach yields competitive results on benchmarks like 300W, LaPa, and CelebAMask-HQ, while significantly reducing data collection costs (e.g., 100k images in ~7 hours on a single V100). By releasing the curated synthetic dataset, pretrained models, and code, SynthForge provides a scalable path for realistic, controllable synthetic data in facial analysis tasks.
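As a rough sanity check on the generation cost quoted above, the per-image throughput follows directly from the two stated figures (100k images, ~7 hours, one V100). The sketch below is only back-of-the-envelope arithmetic. The function name `throughput` is illustrative and not part of the SynthForge code, and the 7-hour figure is approximate, so the results are approximate too.

```python
# Back-of-the-envelope throughput from the figures quoted in the TL;DR.
# Both constants come from the text; "~7 hours" is approximate.
NUM_IMAGES = 100_000
HOURS = 7.0


def throughput(num_images: int, hours: float) -> tuple[float, float]:
    """Return (images per second, seconds per image) for a generation run."""
    seconds = hours * 3600.0
    return num_images / seconds, seconds / num_images


ips, spi = throughput(NUM_IMAGES, HOURS)
print(f"{ips:.2f} images/s, {spi:.3f} s/image")  # 3.97 images/s, 0.252 s/image
```

Under these figures, generation runs at roughly four images per second, or about a quarter of a second per annotated image on a single GPU.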
Abstract
Recent advancements in generative models have unlocked the ability to render photo-realistic data in a controllable fashion. Trained on real data, these generative models can produce realistic samples with minimal to no domain gap compared to traditional graphics rendering. However, using data generated by such models to train downstream tasks remains under-explored, mainly due to the lack of 3D-consistent annotations. Moreover, controllable generative models are learned from massive data, and their latent space is often too vast to yield meaningful sample distributions for downstream tasks when only a limited number of samples is generated. To overcome these challenges, we extract 3D-consistent annotations from an existing controllable generative model, making the data useful for downstream tasks. Our experiments show competitive performance against state-of-the-art models using only generated synthetic data, demonstrating its potential for solving downstream tasks. Project page: https://synth-forge.github.io
