When Algorithms Meet Artists: Semantic Compression of Artists' Concerns in the Public AI-Art Debate

Ariya Mukherjee-Gandhi, Oliver Muellerklein

TL;DR

This study investigates whether public discourse on AI-generated art proportionally represents artists' concerns. It builds a shared semantic map from 131 public documents (2013–2025) and projects 1,259 artist probes (34 frames across five dimensions: Threat, Utility, Ownership, Transparency, Compensation) into 22 topics, using a consensus-based semantic projection to ensure cross-corpus stability. The results show strong semantic compression: 95% of artist concerns concentrate in just 4 topics, while 14 topics (62% of discourse) contain no artist input, with governance concerns (ownership and transparency) being underrepresented by up to ~7× even after controlling for style. These findings reveal an epistemic marginalization in public AI-art discourse and offer a generalizable auditing method for assessing stakeholder representation in governance debates, with broad implications for policy design and other technology domains.

Abstract

Artists occupy a paradoxical position in generative AI: their work trains the models reshaping creative labor. We tested whether their concerns achieve proportional representation in public discourse shaping AI governance. Analyzing public AI-art discourse (news, podcasts, legal filings, research; 2013–2025) and projecting 1,259 survey-derived artist statements into this semantic space, we find stark compression: 95% of artist concerns cluster in 4 of 22 discourse topics, while 14 topics (62% of discourse) contain no artist perspective. This compression is selective: governance concerns (ownership, transparency) are 7× underrepresented, whereas affective themes (threat, utility) show only 1.4× underrepresentation after style controls. The pattern indicates semantic, not stylistic, marginalization. These findings demonstrate a measurable representational gap: decision-makers relying on public discourse as a proxy for stakeholder priorities will systematically underweight those most affected. We introduce a consensus-based semantic projection methodology that is currently being validated across domains and generalizes to other stakeholder-technology contexts.
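
To make the headline quantities concrete, here is a minimal Python sketch of how a top-k concentration share, an empty-topic discourse share, and a salience ratio could be computed from topic assignments. The function names, the toy data, and the exact form of the ratio (artist share divided by discourse share in a topic region) are assumptions for illustration; the paper's own computation may differ.

```python
from collections import Counter


def top_k_share(probe_topics, k):
    """Fraction of probes that land in the k most populated topics."""
    counts = Counter(probe_topics)
    in_top_k = sum(n for _, n in counts.most_common(k))
    return in_top_k / len(probe_topics)


def empty_topic_discourse_share(probe_topics, discourse_volume):
    """Share of discourse volume sitting in topics that received no probe."""
    occupied = set(probe_topics)
    total = sum(discourse_volume.values())
    empty = sum(v for t, v in discourse_volume.items() if t not in occupied)
    return empty / total


def salience_ratio(probe_topics, discourse_volume, region):
    """Artist share in `region` divided by discourse share in `region`.

    Values above 1 mean artists concentrate there more than public
    discourse does (assumed definition, for illustration only).
    """
    artist_share = sum(1 for t in probe_topics if t in region) / len(probe_topics)
    discourse_share = (
        sum(discourse_volume[t] for t in region) / sum(discourse_volume.values())
    )
    return artist_share / discourse_share


# Hypothetical toy data: 10 probes assigned to topics, 4 topics of discourse.
toy_probe_topics = [0] * 6 + [1] * 3 + [2] * 1
toy_discourse_volume = {0: 10, 1: 10, 2: 20, 3: 60}

toy_top2 = top_k_share(toy_probe_topics, 2)
toy_empty = empty_topic_discourse_share(toy_probe_topics, toy_discourse_volume)
toy_ratio = salience_ratio(toy_probe_topics, toy_discourse_volume, {0, 1})
```

In the toy data, most probes fall in two topics while the largest discourse topic receives none, which is the same qualitative pattern the abstract reports at a much larger scale.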

Paper Structure

This paper contains 67 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Artist concerns concentrate in a narrow region of public discourse. Distribution of 1,259 artist probes across 22 public discourse topics. Just 4 topics capture 95.4% of artist concerns, while 14 topics (62.4% of public discourse volume) contain no artist perspective whatsoever. The 34 distinct frames articulated by artists, which represent substantively different positions on ownership, transparency, compensation, threat, and utility, collapse into this narrow semantic region.
  • Figure 2: Governance concerns are most marginalized. Salience ratios by concern theme, comparing artist probe concentration to public discourse concentration in theme-relevant topic regions. Values $>1$ indicate artist overconcentration (public under-emphasis). Governance themes (Ownership, Transparency) show the strongest underrepresentation at 6.95$\times$, while affective themes (Threat, Utility) show moderate underrepresentation at 4.81$\times$. The concerns most actionable for policy are precisely those most systematically absent from public discourse.
  • Figure 3: Two-dimensional UMAP projection of the public discourse semantic space showing all 22 topic clusters. Topics containing artist perspectives are circled in red (Topics 1, 8, 9, 10, 16, 17, 18). Artist concerns concentrate in a narrow region of the semantic space, while the majority of topics (particularly those in the upper portion of the map) contain no artist perspective.
  • Figure 4: Semantic pathway concentration analysis. The top 5 pathways account for 73.4% of all artists, and the top 10 pathways account for 89.4%, indicating stable cross-theme patterns rather than idiosyncratic scattering. Notably, governance concerns (Ownership and Transparency) exhibited the greatest internal diversity among the five concern dimensions, yet they are arguably the most compressed: Cluster 17 appears in all top 10 pathways for both Ownership and Transparency themes, collapsing this nuance into a single discursive region. By contrast, affective themes (Threat, Utility) distribute across Clusters 10 and 16, achieving a slightly broader discursive footprint.
  • Figure 5: Consensus UMAP methodology. We generate 31 UMAP projections with different random seeds, compute pairwise distance matrices for each, average these matrices, and fit a final embedding from the consensus distance structure. This approach increases average seed-to-consensus Adjusted Rand Index from 0.56 (naive coordinate averaging) to 0.71 (distance-matrix consensus), yielding a more stable reference geometry.
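
The distance-averaging step described in the Figure 5 caption can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, the two toy "runs" stand in for UMAP projections fitted with different seeds, and the final step (fitting an embedding from the consensus matrix) is left out. One option for that last step, assuming the umap-learn package, would be to pass the matrix to `UMAP(metric="precomputed")`.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for one embedding (list of coordinate tuples)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def consensus_distance_matrix(embeddings):
    """Average the pairwise-distance matrices of several embeddings.

    Every embedding must place the same items in the same order.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    n = len(embeddings[0])
    total = [[0.0] * n for _ in range(n)]
    for emb in embeddings:
        if len(emb) != n:
            raise ValueError("all embeddings must contain the same items")
        dist = pairwise_distances(emb)
        for i in range(n):
            for j in range(n):
                total[i][j] += dist[i][j]
    return [[value / len(embeddings) for value in row] for row in total]


def average_coordinates(embeddings):
    """Naive alternative: average raw coordinates across runs."""
    n = len(embeddings[0])
    dims = len(embeddings[0][0])
    return [
        tuple(sum(emb[i][d] for emb in embeddings) / len(embeddings) for d in range(dims))
        for i in range(n)
    ]


# Hypothetical toy runs: run_b is run_a rotated by 180 degrees, so both
# encode the same geometry in different coordinate frames.
run_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
run_b = [(0.0, 0.0), (-1.0, 0.0), (0.0, -2.0)]

consensus = consensus_distance_matrix([run_a, run_b])
naive = average_coordinates([run_a, run_b])
```

In this toy case the naive coordinate average collapses all three points onto the origin, while the consensus distance matrix keeps the original inter-point distances. Pairwise distances do not change when a single run is rotated or reflected, which is one plausible reason the distance-matrix consensus is reported as more stable than coordinate averaging.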