Civiverse: A Dataset for Analyzing User Engagement with Open-Source Text-to-Image Models

Maria-Teresa De Rosa Palmini, Laura Wagner, Eva Cetinic

TL;DR

Open-source text-to-image (TTI) systems raise cultural and ethical concerns that are not fully captured by model outputs or training data alone. The paper introduces Civiverse 6M, a large-scale prompt dataset collected from CivitAI, and presents a prompt-semantics analysis pipeline that uses MiniLM sentence embeddings (all-MiniLM-L6-v2), UMAP, and HDBSCAN to derive topics, along with a named-entity analysis of artist names. The findings show a strong preference for explicit content, prevalent style mimicry via artist names, and a growing share of NSFW content, pointing to risks of misogyny and visual homogenization in open-source TTI ecosystems. These insights argue for more responsible design, policy consideration, and future work analyzing model configurations and generation parameters on open platforms.
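
The embed, reduce, cluster sequence named above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the hyperparameters (n_neighbors, n_components, min_cluster_size, the random seed), and the comma-based splitting of prompts into specifiers are all assumptions, and the frequency count used to label each cluster is a simplified stand-in for whatever topic-representation step the paper applies. The sketch assumes the third-party packages sentence-transformers, umap-learn, and hdbscan are installed.

```python
from collections import Counter, defaultdict


def cluster_prompts(prompts, min_cluster_size=15, seed=42):
    """Embed prompts, reduce dimensionality, and cluster them.

    Third-party packages are imported lazily so the helper below can be
    used without them. All hyperparameters are illustrative assumptions,
    not values reported in the paper.
    """
    from sentence_transformers import SentenceTransformer
    import umap
    import hdbscan

    # Sentence embeddings; this model yields 384-dimensional vectors.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(prompts)

    # Reduce to a few dimensions before density-based clustering.
    reduced = umap.UMAP(
        n_neighbors=15, n_components=5, metric="cosine", random_state=seed
    ).fit_transform(embeddings)

    # HDBSCAN assigns -1 to points it treats as noise.
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(reduced)
    return [int(label) for label in labels]


def top_terms_per_topic(prompts, labels, k=3):
    """Return the k most frequent comma-separated specifiers per cluster.

    Prompts labeled -1 (noise) are skipped.
    """
    counts = defaultdict(Counter)
    for prompt, label in zip(prompts, labels):
        if label == -1:
            continue
        terms = [t.strip().lower() for t in prompt.split(",") if t.strip()]
        counts[label].update(terms)
    return {
        label: [term for term, _ in counter.most_common(k)]
        for label, counter in counts.items()
    }
```

On a real corpus one would call cluster_prompts on the list of prompt strings and pass the returned labels to top_terms_per_topic; the second function can also be tried on hand-assigned labels without installing the clustering packages.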

Abstract

Text-to-image (TTI) systems, particularly those utilizing open-source frameworks, have become increasingly prevalent in the production of Artificial Intelligence (AI)-generated visuals. While existing literature has explored various problematic aspects of TTI technologies, such as bias in generated content, intellectual property concerns, and the reinforcement of harmful stereotypes, open-source TTI frameworks have not yet been systematically examined from a cultural perspective. This study addresses this gap by analyzing the CivitAI platform, a leading open-source platform dedicated to TTI AI. We introduce the Civiverse prompt dataset, encompassing millions of images and related metadata. We focus on prompt analysis, specifically examining the semantic characteristics of text prompts, as it is crucial for addressing societal issues related to generative technologies. This analysis provides insights into user intentions, preferences, and behaviors, which in turn shape the outputs of these models. Our findings reveal a predominant preference for generating explicit content, along with a focus on homogenization of semantic content. These insights underscore the need for further research into the perpetuation of misogyny, harmful stereotypes, and the uniformity of visual culture within these models.

Paper Structure (11 sections, 6 figures, 5 tables)

This paper contains 11 sections, 6 figures, 5 tables.

Figures (6)

  • Figure 1: A typical text-to-image prompt (a) and its associated image generation output (b).
  • Figure 2: Histogram of daily uploaded images from October 2023 to April 2024, color-coded by the classification assigned by CivitAI.
  • Figure 3: 8 most popular topics for positive prompts of the Civiverse dataset.
  • Figure 4: Visualization of the all-MiniLM-L6-v2 embeddings of positive prompt specifiers.
  • Figure 5: 8 most popular topics for negative prompts of the Civiverse dataset.
  • ...and 1 more figure