Promptly Yours? A Human Subject Study on Prompt Inference in AI-Generated Art

Khoi Trinh, Joseph Spracklen, Raveen Wijewickrama, Bimal Viswanath, Murtuza Jadliwala, Anindya Maiti

TL;DR

The findings indicate that while humans and human-AI collaborations can infer prompts and generate similar images with high accuracy, they are not as successful as using the original prompt.

Abstract

The emerging field of AI-generated art has witnessed the rise of prompt marketplaces, where creators can purchase, sell, or share prompts for generating unique artworks. These marketplaces often assert ownership over prompts, claiming them as intellectual property. This paper investigates whether concealed prompts sold on prompt marketplaces can be considered secure intellectual property, given that humans and AI tools may be able to approximately infer the prompts from the publicly advertised sample images accompanying each prompt on sale. Specifically, our survey aims to assess (i) how accurately humans can infer the original prompt solely by examining an AI-generated image, with the goal of generating images similar to the original image, and (ii) the possibility of improving upon individual human and AI prompt inferences by crafting human-AI combined prompts with the help of a large language model. Although previous research has explored the use of AI and machine learning to infer prompts (and also to protect against prompt inference), we are the first to include humans in the loop. Our findings indicate that while humans and human-AI collaborations can infer prompts and generate similar images with high accuracy, they are not as successful as using the original prompt.

Paper Structure

This paper contains 52 sections, 2 equations, and 29 figures.

Figures (29)

  • Figure 1: Image generations using SDXL with prompts containing the same subject (cat) and different combinations of two modifiers (pixel art and dark colors).
  • Figure 2: Participants' familiarity levels with different generative AI tools.
  • Figure 3: Analysis of multiple-answer question (MSQ) responses from Part I, aimed at identifying disparities across (a) four distinct txt2img models, (b) various subjects depicted in the images, (c) participants' arts background, and (d) recruitment sources.
  • Figure 4: Semantic similarity and CLIP score for different txt2img models in (a) the controlled dataset and (b) the uncontrolled dataset. Success thresholds are depicted as dashed lines: green for the L14 model and red for the B32 model.
  • Figure 5: Perceptual similarity and image hash scores for images generated using the four different txt2img models.
  • ...and 24 more figures