
FRRffusion: Unveiling Authenticity with Diffusion-Based Face Retouching Reversal

Fengchuang Xing, Xiaowen Shi, Yuan-Gen Wang, Chunsheng Yang

TL;DR

This work introduces Face Retouching Reversal (FRR), the task of recovering the authentic facial appearance from retouched images, addressing a rising risk of deceptive online content. It builds the first FRR dataset, deepFRR, from StyleGAN-generated faces retouched via a commercial API, and presents FRRffusion, a two-stage framework: a diffusion-based FMAR restores coarse structure, and a Transformer-based HFDG synthesizes high-resolution detail. Across four evaluation metrics (PSNR, SSIM, VGGS, CLIPS) and multiple datasets, FRRffusion consistently outperforms GP-UNIT and Stable Diffusion, and it is also preferred in qualitative perceptual assessments, including a subjective study with 85 participants. The results suggest that FRRffusion effectively bridges FRR with existing restoration tasks and highlight its practical potential for authenticity verification in advertising and legal contexts, while also outlining avenues for improvement and robustness against evolving retouching technologies.

Abstract

Unveiling the real appearance of retouched faces to prevent malicious users from deceptive advertising and economic fraud has been an increasing concern in the era of the digital economy. This article makes the first attempt to investigate the face retouching reversal (FRR) problem. We first collect an FRR dataset, named deepFRR, which contains 50,000 StyleGAN-generated high-resolution (1024×1024) facial images and their corresponding retouched versions produced by a commercial online API. To the best of our knowledge, deepFRR is the first FRR dataset tailored for training deep FRR models. Then, we propose a novel diffusion-based approach (FRRffusion) for the FRR task. FRRffusion consists of a coarse-to-fine two-stage network: a diffusion-based Facial Morpho-Architectonic Restorer (FMAR) generates the basic contours of low-resolution faces in the first stage, while a Transformer-based Hyperrealistic Facial Detail Generator (HFDG) creates high-resolution facial details in the second stage. Tested on deepFRR, FRRffusion surpasses the GP-UNIT and Stable Diffusion methods by a large margin on four widely used quantitative metrics. In particular, in a qualitative evaluation with 85 subjects, the images de-retouched by FRRffusion are visually much closer to the raw face images than both the retouched face images and those restored by GP-UNIT and Stable Diffusion. These results validate the efficacy of our work, bridging the gap between FRR and generic image restoration tasks. The dataset and code are available at https://github.com/GZHU-DVL/FRRffusion.
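
PSNR is one of the four quantitative metrics named in the TL;DR. As a rough illustration of what such a score measures, the sketch below computes PSNR between a raw image and a de-retouched one. It is not taken from the authors' code: the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two images.

    Both images are given as flat sequences of pixel intensities
    (an assumption of this sketch; real pipelines typically use arrays).
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel positions.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example: one pixel differs by 10 intensity levels.
raw = [100, 100, 100, 100]
de_retouched = [110, 100, 100, 100]
print(f"PSNR: {psnr(raw, de_retouched):.2f} dB")
```

A higher PSNR means the de-retouched image is numerically closer to the raw face; in the FRR setting the reference would be the raw (pre-retouching) image from a deepFRR pair.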

Paper Structure

This paper contains 22 sections, 17 equations, 11 figures, 5 tables, 2 algorithms.

Figures (11)

  • Figure 1: Example illustration of our deepFRR. We randomly select 10 face image pairs from the deepFRR dataset. Each pair consists of a raw AI-generated face image (first row) and its corresponding retouched one (second row).
  • Figure 2: Overview of the proposed FRRffusion framework. It includes Facial Morpho-Architectonic Restorer and Hyperrealistic Facial Detail Generator.
  • Figure 4: Qualitative comparison among different FRR methods.
  • Figure 5: Heatmap visualization of subjective evaluation results.
  • Figure 6: Qualitative illustration of the FRR effects of our FRRffusion method. The three images in each column are subjected to the same single type of face retouching operation; all images are randomly selected from RetouchingFFHQ-3.
  • ...and 6 more figures