Representing Beauty: Towards a Participatory but Objective Latent Aesthetics
Alexander Michael Rusnak
TL;DR
This work interrogates what it means for a machine to recognize beauty, proposing that aesthetic representations converge across diverse models and modalities, implying a realist, universal geometry of beauty. It introduces the universal representation hypothesis and anchors it in hylomorphic principles, arguing that form is inseparable from material constraints and that beauty functions as a teleological binder in latent spaces. Empirically, it notes higher embedding self-similarity for aesthetic content and greater cross-model alignment, especially in mid-layer representations, suggesting a hierarchical abstraction from particular to transcendent concepts. The study frames human co-creation as foundational, with machines capable of offering novel insights at scale while remaining grounded in human intentions and cultural processes, signaling a productive partnership in cultural production and analysis.
Abstract
What does it mean for a machine to recognize beauty? While beauty remains a culturally and experientially compelling but philosophically elusive concept, deep learning systems increasingly appear capable of modeling aesthetic judgment. In this paper, we explore the capacity of neural networks to represent beauty despite the immense formal diversity of the objects to which the term applies. Drawing on recent work on cross-model representational convergence, we show that aesthetic content produces more similar and more closely aligned representations across models trained on distinct data and modalities, while unaesthetic images do not. This finding implies that the formal structure of beautiful images has a realist basis, rather than being only a reflection of socially constructed values. Furthermore, we propose that these realist representations exist because aesthetic form is jointly grounded in physical and cultural substance. We argue that human perceptual and creative acts play a central role in shaping the latent spaces of deep learning systems, but that a realist basis for aesthetics shows that machines are not mere creative parrots: they can produce novel creative insights from the unique vantage point of scale. Our findings suggest that human-machine co-creation is not merely possible but foundational, with beauty serving as a teleological attractor in both cultural production and machine perception.
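To make the notion of cross-model alignment concrete, the following is a minimal sketch of one common way to compare how two models embed the same set of images: linear centered kernel alignment (CKA). It is an illustrative assumption, not necessarily the metric used in this work; the function names and the toy embeddings are hypothetical. A score near 1 means the two models arrange the items with the same geometry up to rotation and uniform scaling, while unrelated embeddings score much lower.

```python
import math
import random


def center_columns(rows):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def gram(rows):
    """Inner-product (linear kernel) matrix between all pairs of items."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def frobenius_inner(a, b):
    """Sum of elementwise products of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def linear_cka(x, y):
    """Linear CKA between two embedding sets of the same n items.

    x: n rows of d1 features (model A); y: n rows of d2 features (model B).
    Returns a value in [0, 1]. Assumes neither embedding set is constant
    across items (otherwise the denominator is zero).
    """
    k = gram(center_columns(x))
    l = gram(center_columns(y))
    return frobenius_inner(k, l) / math.sqrt(
        frobenius_inner(k, k) * frobenius_inner(l, l)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical embeddings of 50 images from "model A" (4 features).
    emb_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
    # "Model B" encodes the same geometry with permuted, rescaled axes.
    emb_b = [[3 * r[2], 3 * r[0], 3 * r[3], 3 * r[1]] for r in emb_a]
    # "Model C" is unrelated noise with a different dimensionality.
    emb_c = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(50)]
    print("A vs B:", linear_cka(emb_a, emb_b))
    print("A vs C:", linear_cka(emb_a, emb_c))
```

Under this kind of measure, the claim in the abstract would correspond to aesthetic image sets yielding higher cross-model scores than unaesthetic ones. The embeddings here are synthetic; real use would substitute features extracted from the models being compared.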
