BLANKET: Anonymizing Faces in Infant Video Recordings

Ditmar Hadera, Jan Cech, Miroslav Purkrabek, Matej Hoffmann

TL;DR

BLANKET is an infant-focused video anonymization pipeline that changes identity while preserving facial attributes. A diffusion-based inpainting step first generates a compatible new face, which is then inserted into the video by temporally consistent face swapping with expression transfer. The approach emphasizes infant-domain suitability, temporal coherence, and downstream task preservation. Both BLANKET and the competing method (DeepPrivacy2) alter identity, and BLANKET outperforms DeepPrivacy2 on attribute retention, pose estimation impact, and artifact prevalence. Through quantitative analyses and a user study on an infant video dataset, BLANKET demonstrates strong preservation of gaze, expressions, and head orientation, while maintaining high detection and pose AP in downstream tasks. The work offers a practical privacy-preserving tool for infant research datasets. Its limitations relate to face-detection reliability and control of identity leakage, and future directions include generating more dissimilar compatible identities and improving artifact handling.

Abstract

Ensuring the ethical use of video data involving human subjects, particularly infants, requires robust anonymization methods. We propose BLANKET (Baby-face Landmark-preserving ANonymization with Keypoint dEtection consisTency), a novel approach designed to anonymize infant faces in video recordings while preserving essential facial attributes. Our method comprises two stages. First, a new random face, compatible with the original identity, is generated via inpainting using a diffusion model. Second, the new identity is seamlessly incorporated into each video frame through temporally consistent face swapping with authentic expression transfer. The method is evaluated on a dataset of short video recordings of babies and is compared to the popular anonymization method, DeepPrivacy2. Key metrics assessed include the level of de-identification, preservation of facial attributes, impact on human pose estimation (as an example of a downstream task), and presence of artifacts. Both methods alter the identity, and our method outperforms DeepPrivacy2 in all other respects. The code is available as an easy-to-use anonymization demo at https://github.com/ctu-vras/blanket-infant-face-anonym.
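
The two-stage structure described in the abstract can be sketched as a small skeleton. This is only an illustration under stated assumptions, not the authors' implementation: `anonymize_video`, `generate_identity`, and `swap_face` are hypothetical names, and the two callables stand in for the diffusion-based inpainting step and the face-swapping step, which are not reproduced here.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Identity = TypeVar("Identity")


def anonymize_video(
    frames: Sequence[Frame],
    generate_identity: Callable[[Frame], Identity],
    swap_face: Callable[[Frame, Identity], Frame],
) -> List[Frame]:
    """Two-stage anonymization skeleton (hypothetical interface).

    generate_identity: placeholder for the compatible-identity generator
        (in the paper, inpainting with a diffusion model).
    swap_face: placeholder for the per-frame face swap with expression transfer.
    """
    if not frames:
        return []
    # Stage 1: create a single new identity from the first frame only.
    new_identity = generate_identity(frames[0])
    # Stage 2: swap that same identity into every frame of the video.
    return [swap_face(frame, new_identity) for frame in frames]


# Toy usage with string stand-ins for frames and identities.
frames = ["frame0", "frame1", "frame2"]
result = anonymize_video(
    frames,
    generate_identity=lambda first: f"id-from-{first}",
    swap_face=lambda frame, identity: f"{frame}+{identity}",
)
```

Generating the identity once and reusing it for all frames is the design choice this sketch highlights: every frame receives the same new face, which is a precondition for a temporally consistent result.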

Paper Structure

This paper contains 11 sections, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Infant face anonymization. Frame of the original video (a), anonymization by the proposed BLANKET method (b), and by DeepPrivacy2 (c). The proposed method alters the identity of the infant without introducing obvious perceptual artifacts, while keeping all other facial attributes intact, e.g., face orientation, gaze, and expression. The method can handle face occlusion, which is challenging for the competing method. Original video courtesy of Stephen Julia, Max Family Fun.
  • Figure 2: Flowchart of the proposed video anonymization. First, an image is created from the first frame in which the original identity is replaced with a new, compatible random identity. The new face is generated by inpainting using Stable Diffusion. Then, the new identity is swapped into every frame of the video. We propose to use FaceFusion, as it provides temporally consistent results while preserving original facial expressions.
  • Figure 3: Flowchart of the compatible-identity generator. A random identity is generated by an inpainting algorithm, which makes the new identity compatible, well fitting, and seamlessly merged with the original image.
  • Figure 4: Anonymization can change the output of a detector. Original image (a), black rectangle anonymization (b), DeepPrivacy2 (c), and BLANKET (d) with their respective bounding boxes detected by RTMDet. Missing face information truncates the head (b) or enlarges the detection (c).
  • Figure 5: A deformed face can lead to a wrong number of detections. Original image (a) with its bounding box detected by RTMDet, black rectangle anonymization (b), DeepPrivacy2 (c), and BLANKET (d). Missing face information in (b) and (c) causes false positive detections (purple bounding boxes).
  • ...and 5 more figures