Imitating the Functionality of Image-to-Image Models Using a Single Example

Nurit Spingarn-Eliezer, Tomer Michaeli

TL;DR

This work demonstrates that the functionality of many image-to-image translation models can be faithfully imitated from a single input-output example, by training an imitating model to reproduce the observed output with a simple L2 objective. The authors connect this vulnerability to the effective receptive field, showing that small local regions within the single example provide rich training signals, and they systematically study factors such as degradation type, example choice, target loss, and architecture. They validate the phenomenon across diverse biological and medical translation tasks, including virtual staining and tissue-type prediction, finding high-fidelity imitation (PSNR often > 31 dB) even when the target model’s data, architecture, and training details are unknown. The paper discusses practical defenses (watermarking and JPEG compression) and highlights security implications, suggesting that more robust defenses and further research into the local behavior of image translation models are warranted.

Abstract

We study the possibility of imitating the functionality of an image-to-image translation model by observing input-output pairs. We focus on cases where training the model from scratch is impossible, either because training data are unavailable or because the model architecture is unknown. This is the case, for example, with commercial models for biological applications. Since the development of these models requires large investments, their owners commonly keep them confidential, and reveal only a few input-output examples on the company's website or in an academic paper. Surprisingly, we find that even a single example typically suffices for learning to imitate the model's functionality, and that this can be achieved using a simple distillation approach. We present an extensive ablation study encompassing a wide variety of model architectures, datasets and tasks, to characterize the factors affecting vulnerability to functionality imitation, and provide a preliminary theoretical discussion on the reasons for this unwanted behavior.
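
The single-example distillation idea can be illustrated with a deliberately tiny sketch. It assumes that both the hidden target and the imitating model are a single 3×3 linear filter, which is far simpler than the networks studied in the paper; the names `hidden_kernel`, `black_box`, `apply_filter` and `imitate`, the image size, and the learning rate are all hypothetical choices made so the example runs with the Python standard library alone.

```python
import random

def pixel(img, y, x):
    """Return img[y][x], or 0.0 outside the image (zero padding)."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0.0

def apply_filter(img, kernel):
    """Zero-padded, same-size filtering of a 2-D list with a square, odd-sized kernel."""
    p = len(kernel) // 2
    taps = range(len(kernel))
    return [[sum(kernel[a][b] * pixel(img, i + a - p, j + b - p)
                 for a in taps for b in taps)
             for j in range(len(img[0]))]
            for i in range(len(img))]

def mse(u, v):
    """Mean squared error between two equally sized 2-D lists."""
    n, m = len(u), len(u[0])
    return sum((u[i][j] - v[i][j]) ** 2 for i in range(n) for j in range(m)) / (n * m)

def imitate(x, y, ksize=3, steps=300, lr=0.2):
    """Fit an imitating filter g by gradient descent on the L2 loss for one (x, y) pair."""
    n, m, p = len(x), len(x[0]), ksize // 2
    g = [[0.0] * ksize for _ in range(ksize)]
    for _ in range(steps):
        pred = apply_filter(x, g)
        resid = [[pred[i][j] - y[i][j] for j in range(m)] for i in range(n)]
        for a in range(ksize):
            for b in range(ksize):
                # d(MSE)/d g[a][b] = 2/(n*m) * sum(residual * input pixel under this tap)
                grad = sum(resid[i][j] * pixel(x, i + a - p, j + b - p)
                           for i in range(n) for j in range(m)) * 2.0 / (n * m)
                g[a][b] -= lr * grad
    return g

rng = random.Random(0)

def random_image(n=12, m=12):
    """Hypothetical stand-in for a real image: i.i.d. Gaussian pixels."""
    return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]

# Hypothetical "black box": the imitator only ever sees its output on one image.
hidden_kernel = [[0.05, 0.10, 0.05], [0.10, 0.40, 0.10], [0.05, 0.10, 0.05]]

def black_box(img):
    return apply_filter(img, hidden_kernel)

x_example = random_image()
y_example = black_box(x_example)      # the single observed input-output pair
g = imitate(x_example, y_example)

x_test = random_image()               # an image the imitator never saw
test_error = mse(apply_filter(x_test, g), black_box(x_test))
print(f"MSE between imitator and black box on an unseen image: {test_error:.2e}")
```

Every pixel of the one example contributes its own local patch to the loss, so even a small image supplies many more constraints than the filter has parameters. In this toy case the target is exactly representable by the imitator; for real networks the match is only approximate.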

Paper Structure (44 sections, 1 theorem, 1 equation, 33 figures, 19 tables)

This paper contains 44 sections, 1 theorem, 1 equation, 33 figures, 19 tables.

Key Result

Theorem 1

Assume $f_{\theta}$ is a single-layer linear shift-equivariant model with receptive field $w \times h$, whose input and output are single-channel images of the same dimensions (e.g. due to the use of zero-padding). Then using an example image of size ${n}\times{m}$ to optimize eq:UserDefinedNonlinOb

Figures (33)

  • Figure 1: Stealing a black-box image-to-image translation model from the figure of a published paper. We show that the functionality of many image-to-image translation models can be replicated by observing only a single input-output example from the model. Such an example can be obtained e.g. from the figure of a published paper or from a demo on a company's website. Here, we extract a single input-output pair from Fig. 2 in liu2022instant, which showcases a model for spectrally resolving femto-stimulated Raman scattering images. We train an "imitating" U-Net model $g_{\theta}$ on that single pair. Then, we test our imitating model on unseen pairs taken from different locations within the same figure. Our imitating model's outputs are almost indistinguishable from the original model's outputs.
  • Figure 2: Imitating the functionality of biological image-to-image translation models. On the left of each pane we show the single input-output pair we used for imitating the model (gray background). On the right, we compare the outputs of our imitating model with those of the original black-box model on test images. For visualization purposes we show only $128\times128$ crops.
  • Figure 3: Receptive field, image size, and capacity. The plots depict the PSNR between the outputs of the imitating model and the target model. (a) PSNR as a function of the receptive field. (b) PSNR as a function of model size with a fixed receptive field of $31\times31$. (c) PSNR as a function of image size with a fixed receptive field of $31\times31$.
  • Figure 4: Effective receptive fields of denoisers. The plots depict the absolute difference between the outputs of each model for two input images that are completely identical, apart from a small change in the center pixel. The results are averaged over the entire DIV2K test set, contaminated by noise of level $25$ (as expected by the models). Note the logarithmic color scale.
  • Figure 5: Imitating the functionality of restoration models. The figure shows imitation of the SCUNet model for (nonblind) denoising and the Restormer model for (blind) de-raining. The left panes depict the single examples used for imitation. The right pane compares the outputs of the imitating models to those of the target models on test images. See SM for many more visual examples.
  • ...and 28 more figures
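
The probing procedure described in the Figure 4 caption (perturb only the center pixel, then average the absolute output difference over images) can be sketched as follows. This is a minimal illustration, not the paper's code: `toy_model` is a hypothetical stand-in for a denoiser (two 3×3 blurs with a tanh in between), and the random 11×11 images replace the noisy DIV2K test set.

```python
import math
import random

def box_blur(img):
    """3x3 mean filter with zero padding (output has the input's size)."""
    n, m = len(img), len(img[0])
    return [[sum(img[y][x]
                 for y in range(max(i - 1, 0), min(i + 2, n))
                 for x in range(max(j - 1, 0), min(j + 2, m))) / 9.0
             for j in range(m)]
            for i in range(n)]

def toy_model(img):
    """Hypothetical stand-in for a target model: blur -> tanh -> blur."""
    hidden = [[math.tanh(v) for v in row] for row in box_blur(img)]
    return box_blur(hidden)

def erf_map(model, images, eps=1e-2):
    """Average |model(x) - model(x')|, where x' differs from x only at the center pixel."""
    n, m = len(images[0]), len(images[0][0])
    ci, cj = n // 2, m // 2
    total = [[0.0] * m for _ in range(n)]
    for img in images:
        bumped = [row[:] for row in img]
        bumped[ci][cj] += eps          # the only difference between the two inputs
        out_a, out_b = model(img), model(bumped)
        for i in range(n):
            for j in range(m):
                total[i][j] += abs(out_a[i][j] - out_b[i][j])
    return [[v / len(images) for v in row] for row in total]

rng = random.Random(0)
images = [[[rng.random() for _ in range(11)] for _ in range(11)] for _ in range(8)]
erf = erf_map(toy_model, images)

# Output pixels that reacted at all; two stacked 3x3 filters can reach at most 5x5.
support = [(i, j) for i in range(11) for j in range(11) if erf[i][j] > 0.0]
rows = [i for i, _ in support]
cols = [j for _, j in support]
print("affected rows:", min(rows), "to", max(rows))
print("affected cols:", min(cols), "to", max(cols))
```

For a real network the map typically decays gradually rather than cutting off sharply, which is why the caption notes a logarithmic color scale.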

Theorems & Definitions (1)

  • Theorem 1: rephrased from phuong2019towards
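
The setting of Theorem 1 can be written out explicitly. As a sketch under assumed notation (filter taps $\theta[a,b]$, an example pair $(x, y)$, and zero values outside the image; none of these symbols are taken from the paper), a single-layer linear shift-equivariant model with receptive field $w \times h$ is a convolution:

$$
f_{\theta}(x)[i,j] = \sum_{a=1}^{h} \sum_{b=1}^{w} \theta[a,b] \, x\big[i + a - \lceil h/2 \rceil,\; j + b - \lceil w/2 \rceil\big]
$$

Matching an observed output $y$ at every pixel of an ${n}\times{m}$ example therefore gives $nm$ linear equations in only $wh$ unknowns. If $y$ was produced by a model of this form and the example contains $wh$ linearly independent $w \times h$ patches, these equations determine $\theta$ uniquely. This is one way to see why a single image can carry enough information in this simplified case; it illustrates the setting and is not a restatement of the theorem's conclusion.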