Simple Lines, Big Ideas: Towards Interpretable Assessment of Human Creativity from Drawings

Zihao Lin, Zhenshan Shi, Sasa Zhao, Hanwei Zhu, Lingyu Zhu, Baoliang Chen, Lei Mo

TL;DR

This work tackles automatic creativity assessment from drawings by modeling creativity as emerging from both content and style, addressing the shortcomings of expert-only scoring and opaque models. It introduces Content-Style conditioned Creativity Assessment (CSCA), a unified multi-task framework that jointly predicts content category, style descriptors, and a creativity rating by modulating CLIP embeddings with content- and style-sensitive cues. Training uses the loss $\mathcal{L} = \mathcal{L}_{\text{reg}} + \lambda \mathcal{L}_{\text{cls}}$, and the rating $\hat{q}(I)$ is predicted by aligning the image representation with creativity prompts $F_{T_s}$ via cosine similarity. Core contributions include dataset augmentation with semantic content labels and a style proxy, learnable creativity rating embeddings, and two conditional tuning mechanisms (Content Conditional Tuner and Style Conditional Tuner) that yield interpretable predictions and cross-task generalization. Empirically, CSCA achieves state-of-the-art SRCC/PLCC on the primary set ($0.86/0.87$) and generalizes to unseen raters and tasks with decreasing accuracy (RG1: $0.82/0.79$, RG2: $0.74/0.73$, FG: $0.48/0.49$), with ablations confirming the value of each component. The approach offers a scalable, explainable alternative to subjective scoring with potential applications in education and cognitive science, and future work may extend to cross-cultural generalization and multi-modal creativity domains.
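The prediction and loss described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary only says that $\hat{q}(I)$ comes from cosine similarity with the creativity prompts $F_{T_s}$ and that the loss is $\mathcal{L}_{\text{reg}} + \lambda \mathcal{L}_{\text{cls}}$. The softmax-weighted average over discrete rating levels, the temperature value, the use of squared error for $\mathcal{L}_{\text{reg}}$, and cross-entropy for $\mathcal{L}_{\text{cls}}$ are all assumptions, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_rating(image_feat, prompt_feats, levels, temperature=0.07):
    """Estimate q_hat(I) from similarities to creativity prompt embeddings.

    ASSUMPTION: each prompt embedding corresponds to one rating level, and the
    prediction is the softmax-weighted mean of those levels.
    """
    logits = [cosine(image_feat, f) / temperature for f in prompt_feats]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sum(p * level for p, level in zip(probs, levels))


def total_loss(q_hat, q_true, class_probs, true_class, lam=1.0):
    """L = L_reg + lambda * L_cls.

    ASSUMPTION: L_reg is squared error on the rating and L_cls is
    cross-entropy on the predicted class probabilities.
    """
    l_reg = (q_hat - q_true) ** 2
    l_cls = -math.log(class_probs[true_class])
    return l_reg + lam * l_cls
```

In this sketch, `lam` plays the role of $\lambda$: a larger value puts more weight on the classification term relative to the rating regression term.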

Abstract

Assessing human creativity through visual outputs, such as drawings, plays a critical role in fields including psychology, education, and cognitive science. However, current assessment practices still rely heavily on expert-based scoring, which is both labor-intensive and inherently subjective. In this paper, we propose a data-driven framework for automatic and interpretable creativity assessment from drawings. Motivated by the cognitive evidence reported in [6] that creativity can emerge from both what is drawn (content) and how it is drawn (style), we reinterpret the creativity score as a function of these two complementary dimensions. Specifically, we first augment an existing creativity-labeled dataset with additional annotations targeting content categories. Based on the enriched dataset, we further propose a conditional model that predicts content, style, and ratings simultaneously. In particular, a conditional learning mechanism enables the model to adapt its visual feature extraction by dynamically tuning it to creativity-relevant signals, conditioned on the drawing's stylistic and semantic cues. Experimental results demonstrate that our model achieves state-of-the-art performance compared to existing regression-based approaches and offers interpretable visualizations that align well with human judgments. The code and annotations will be made publicly available at https://github.com/WonderOfU9/CSCA_PRCV_2025

Paper Structure

This paper contains 17 sections, 12 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: (a) Example drawings from a figural creativity task rated as creative (left) versus uncreative (right). (b) Comparison of evaluation processes: the traditional expert-based scoring pipeline (top) versus our proposed automated assessment model (bottom).
  • Figure 2:
  • Figure 3: Correlation between normalized ink intensity and human creativity ratings across different content categories.
  • Figure 4: Mean human creativity ratings across content categories, grouped by ink intensity levels.
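The analysis behind Figures 3 and 4 relates an ink-intensity measure to human ratings within each content category. The following is a rough sketch of how such a per-category correlation could be computed. It is not taken from the paper: defining ink intensity as the fraction of pixels darker than a threshold, the threshold value, the use of Pearson correlation, and all function names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def ink_intensity(pixels, threshold=128):
    """Fraction of dark pixels in a grayscale image given as rows of 0..255.

    ASSUMPTION: "normalized ink intensity" is approximated here as the share
    of pixels darker than `threshold`; the paper may define it differently.
    """
    total = sum(len(row) for row in pixels)
    dark = sum(1 for row in pixels for p in row if p < threshold)
    return dark / total


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")
    return cov / (spread_x * spread_y)


def correlation_by_category(records):
    """Group (category, ink, rating) records and correlate ink with rating."""
    groups = defaultdict(lambda: ([], []))
    for category, ink, rating in records:
        groups[category][0].append(ink)
        groups[category][1].append(rating)
    return {c: pearson(inks, ratings) for c, (inks, ratings) in groups.items()}
```

Grouping before correlating matters because a single pooled coefficient could hide categories in which heavier ink is associated with higher ratings and others in which it is associated with lower ones.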