On the Importance of Conditioning for Privacy-Preserving Data Augmentation

Julian Lorenz, Katja Ludwig, Valentin Haug, Rainer Lienhart

TL;DR

This paper investigates privacy risks in privacy-preserving data augmentation that uses conditioned latent diffusion models. It shows that conditioning on edges and depth preserves identity-related cues, enabling a simple contrastive-learning attacker to re-identify individuals with substantial accuracy, and even enables black-box attacks relying solely on edge representations. Through extensive ablations on conditioning, backbones, and temperatures, the authors demonstrate that conditioned augmentations inherently leak information, with higher re-identification rates when more reference images are available. The findings challenge the viability of edge/depth-preserving diffusion-based anonymization and call for alternative privacy-preserving strategies in data augmentation with real-world implications for sensitive datasets.

Abstract

Latent diffusion models can be used as a powerful augmentation method to artificially extend datasets for enhanced training. To the human eye, these augmented images look very different from the originals. Previous work has suggested using this data augmentation technique for data anonymization. However, we show that latent diffusion models that are conditioned on features like depth maps or edges to guide the diffusion process are not suitable as a privacy-preserving method. We use a contrastive learning approach to train a model that can correctly identify people out of a pool of candidates. Moreover, we demonstrate that anonymization using conditioned diffusion models is susceptible to black-box attacks. We attribute the success of the described methods to the conditioning of the latent diffusion model in the anonymization process. The diffusion model is instructed to produce similar edges for the anonymized images. Hence, a model can learn to recognize these patterns for identification.
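To make the contrastive learning idea concrete, the following is a minimal sketch of a temperature-scaled contrastive loss over paired embeddings. It is an illustration under stated assumptions, not the authors' implementation: the InfoNCE-style form, the function names `normalize` and `contrastive_loss`, and the default temperature of 0.1 are all hypothetical choices. The sketch assumes that row k of `z_orig` and row k of `z_aug` come from the same person, and that all other rows in the batch act as negatives.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def contrastive_loss(z_orig, z_aug, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    z_orig[k] and z_aug[k] are embeddings of an original and an augmented
    image of the same person; every other pairing is treated as a negative.
    """
    z_orig = [normalize(v) for v in z_orig]
    z_aug = [normalize(v) for v in z_aug]
    total = 0.0
    for k, anchor in enumerate(z_aug):
        # Cosine similarity of the augmented embedding to every original,
        # divided by the temperature.
        logits = [
            sum(a * b for a, b in zip(anchor, ref)) / temperature
            for ref in z_orig
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching identity as the target class.
        total += log_denominator - logits[k]
    return total / len(z_aug)
```

Minimizing a loss of this kind pulls embeddings of the same person together and pushes different people apart. A lower temperature sharpens the softmax, so the loss concentrates on the hardest negatives.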

Paper Structure

This paper contains 25 sections, 2 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Overview of our method. An original image is anonymized with Instance Augmentation. Our model takes this image and an image database as an input and outputs latent representations for all images, encoding the person identity. We calculate the cosine similarity scores for the anonymized representation vector and the representations of all persons in the image database and select the image with the highest similarity.
  • Figure 2: Our contrastive learning framework. An image $I_{k_1}$ and an augmented image $\hat{I}_{k_2}$ originating from the same person are passed through the backbone $B$ and the projection head $H$. The similarity of the resulting representations $z_{k_1}$ and $\hat{z}_{k_2}$ is maximized during training.
  • Figure 3: Qualitative results for original images (top row) and augmented images (bottom row) from our dataset, which is based on CelebA.
  • Figure 4: Qualitative examples of augmented images with different conditioning, based on CelebA (top), and the corresponding edges detected with an HED detector (bottom).
  • Figure 5: Edge transformations of the original image (left) using the Canny edge detector (middle) and HED (right).
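
The matching step described in the caption of Figure 1 (score the anonymized representation against every database representation by cosine similarity, then pick the highest score) can be sketched as follows. This is a minimal illustration that assumes the representation vectors have already been computed by some encoder. The function names `cosine_similarity` and `reidentify` are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two representation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def reidentify(query, database):
    """Return the index of the best-matching database entry and all scores.

    query: representation of the anonymized image.
    database: list of representations, one per candidate person.
    """
    scores = [cosine_similarity(query, reference) for reference in database]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores
```

For example, with a three-person database of orthogonal unit vectors, `reidentify([0.1, 0.9, 0.2], database)` selects the second entry, because the query points mostly along that reference direction.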