Imitating the Functionality of Image-to-Image Models Using a Single Example
Nurit Spingarn-Eliezer, Tomer Michaeli
TL;DR
This work demonstrates that the functionality of many image-to-image translation models can be faithfully imitated from a single input-output example, by training an imitating model to reproduce the observed output with a simple L2 objective. The authors connect this vulnerability to the effective receptive field, showing that small local regions within the single example provide rich training signals, and they systematically study factors such as degradation type, example choice, target loss, and architecture. They validate the phenomenon across diverse biological and medical translation tasks, including virtual staining and tissue-type prediction, and find high-fidelity imitation (PSNR often above 31 dB) even when the target model's data, architecture, and training details are unknown. The paper discusses practical defenses (watermarking and JPEG compression) and highlights the security implications, suggesting that more robust defenses and further research into the local behavior of image translation models are warranted.
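In symbols, the L2 objective and the PSNR used to report fidelity can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $f$ is the target model, $g_\theta$ the imitating model, $(x, y)$ the single example with $y = f(x)$, $N$ the number of pixels, and $\mathrm{MAX}$ the peak intensity value.

```latex
\hat{\theta} = \arg\min_{\theta} \; \lVert g_{\theta}(x) - y \rVert_2^2 ,
\qquad
\mathrm{PSNR}(a, b) = 10 \log_{10} \frac{\mathrm{MAX}^2}{\tfrac{1}{N} \lVert a - b \rVert_2^2}
```

The PSNR expression is the standard definition. Which pair of images the authors compare when reporting it (for example, imitator output against target-model output on held-out inputs) is not stated in this summary.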
Abstract
We study the possibility of imitating the functionality of an image-to-image translation model by observing input-output pairs. We focus on cases where training the model from scratch is impossible, either because training data are unavailable or because the model architecture is unknown. This is the case, for example, with commercial models for biological applications. Since the development of these models requires large investments, their owners commonly keep them confidential, and reveal only a few input-output examples on the company's website or in an academic paper. Surprisingly, we find that even a single example typically suffices for learning to imitate the model's functionality, and that this can be achieved using a simple distillation approach. We present an extensive ablation study encompassing a wide variety of model architectures, datasets and tasks, to characterize the factors affecting vulnerability to functionality imitation, and provide a preliminary theoretical discussion on the reasons for this unwanted behavior.
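To make the single-example idea concrete, here is a minimal toy sketch in Python with NumPy. It is not the authors' method or code: the confidential model is replaced by a hypothetical 3x3 blur (`black_box`), and the imitating model is a 3x3 linear filter fitted by least squares, which is the closed-form minimizer of an L2 objective. All function names are illustrative assumptions. The point it shows is that one image pair already yields hundreds of local patch-to-pixel training samples.

```python
import numpy as np

def extract_patches(img, k=3):
    """Return every k x k patch of img (valid region only) as a row vector."""
    h, w = img.shape
    rows = [img[i:i + k, j:j + k].ravel()
            for i in range(h - k + 1)
            for j in range(w - k + 1)]
    return np.asarray(rows)

def black_box(img):
    """Hypothetical stand-in for the confidential target model: a fixed 3x3 blur."""
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
    h, w = img.shape
    return (extract_patches(img) @ kernel.ravel()).reshape(h - 2, w - 2)

def fit_imitator(x, y, k=3):
    """Least-squares (L2) fit of a k x k linear filter from ONE input-output pair."""
    patches = extract_patches(x, k)          # one training sample per output pixel
    weights, *_ = np.linalg.lstsq(patches, y.ravel(), rcond=None)
    return weights

def apply_imitator(weights, img, k=3):
    """Apply the fitted filter to a new image (valid region only)."""
    h, w = img.shape
    return (extract_patches(img, k) @ weights).reshape(h - k + 1, w - k + 1)

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with intensities in [0, peak]."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)

# The single observed example: one input and the black box's output for it.
x_single = rng.random((32, 32))
y_single = black_box(x_single)
weights = fit_imitator(x_single, y_single)

# Fidelity on an input the imitator never saw.
x_new = rng.random((32, 32))
score = psnr(apply_imitator(weights, x_new), black_box(x_new))
print(f"PSNR on a held-out input: {score:.1f} dB")
```

Because the stand-in target is linear and noise-free, the fit here is essentially exact, so the resulting PSNR says nothing about the numbers reported for deep models. The sketch only illustrates why a single 32x32 pair is far from a single training sample: it contains 900 overlapping local patches constraining 9 unknown weights.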
