CognArtive: Large Language Models for Automating Art Analysis and Decoding Aesthetic Elements
Afshin Khadangi, Amir Sartipi, Igor Tchappi, Gilbert Fridgen
TL;DR
The paper tackles automating formal art analysis by applying multimodal large language models to decode technical and expressive elements of artworks. It introduces a pipeline that combines GPT-4V, Gemini 2.0, and GPT-4 to analyze over 15,000 works from 23 artists across 34 styles, guided by an eight-question criteria set. An embedding-based evaluation against ground-truth style descriptions uses four models to quantify how well automated analyses align with stylistic descriptors. The results reveal consistent patterns in form, color, light, movement, and technique over time, and they demonstrate the scalability and potential of AI-assisted art analysis for historians, educators, and enthusiasts.
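As a rough illustration of the embedding-based evaluation mentioned above, the sketch below scores a generated analysis against style descriptions by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the vectors are hand-made toy values standing in for real embedding-model outputs, the style names are arbitrary examples, and `cosine_similarity` and `rank_styles` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_styles(analysis_vec, style_vecs):
    """Return (style, score) pairs sorted from most to least similar."""
    scores = {
        name: cosine_similarity(analysis_vec, vec)
        for name, vec in style_vecs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 3-dimensional vectors (assumption): real embeddings would come from
# an embedding model and have hundreds or thousands of dimensions.
analysis_vec = [0.9, 0.1, 0.3]
style_vecs = {
    "Impressionism": [0.8, 0.2, 0.4],
    "Cubism": [0.1, 0.9, 0.2],
}

ranking = rank_styles(analysis_vec, style_vecs)
for style, score in ranking:
    print(f"{style}: {score:.3f}")
```

Repeating the same scoring with several embedding models and comparing the resulting rankings would be one way to check that an alignment result does not depend on a single model.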
Abstract
Art, as a universal language, can be interpreted in diverse ways, with artworks embodying profound meanings and nuances. The advent of Large Language Models (LLMs) and the availability of Multimodal Large Language Models (MLLMs) raise the question of how these transformative models can be used to assess and interpret the artistic elements of artworks. While research has been conducted in this domain, to the best of our knowledge, a deep and detailed understanding of the technical and expressive features of artworks using LLMs has not been explored. In this study, we investigate the automation of a formal art analysis framework to rapidly analyze a large number of artworks and examine how their patterns evolve over time. We explore how LLMs can decode artistic expressions, visual elements, composition, and techniques, revealing emerging patterns that develop across periods. Finally, we discuss the strengths and limitations of LLMs in this context, emphasizing their ability to process vast quantities of art-related data and generate insightful interpretations. Due to the exhaustive and granular nature of the results, we have developed interactive data visualizations, available online at https://cognartive.github.io/, to enhance understanding and accessibility.
