AACP: Aesthetics assessment of children's paintings based on self-supervised learning
Shiqi Jiang, Ning Li, Chen Shi, Liping Guo, Changbo Wang, Chenhui Li
TL;DR
This paper tackles the aesthetics assessment of children's paintings (AACP) under limited labeled data. It builds a dedicated dataset of more than 20k unlabeled images and 1.2k images labeled on eight attributes, and proposes a self-supervised, four-module architecture. The model combines a ConvMAE-inspired encoder with a Spatial Perception Network, a Channel Perception Network, and a Disentangled Evaluation Network; it is trained in two stages and enhanced by DALL-E-based data augmentation. In qualitative, quantitative, and user studies, the approach achieves state-of-the-art performance on AACP, and the studies also support the quality of the annotations and the viability of expanding the dataset. The work advances aesthetics education by enabling more objective, multi-faceted assessment of children's artistic development, and it provides a foundation for future research into environmental factors and broader attribute sets.
Abstract
The Aesthetics Assessment of Children's Paintings (AACP) is an important branch of image aesthetics assessment (IAA) and plays a significant role in children's education. The task presents unique challenges, such as limited available data and the need for evaluation metrics from multiple perspectives. Previous approaches, however, rely on training with large datasets and then assign an aesthetics score to each image, which is not applicable to AACP. To solve this problem, we construct an aesthetics assessment dataset of children's paintings and propose a model based on self-supervised learning. 1) We build a novel dataset composed of two parts: the first contains more than 20k unlabeled images of children's paintings; the second contains 1.2k images of children's paintings, each annotated with eight attributes by multiple design experts. 2) We design a pipeline that includes a feature extraction module, perception modules, and a disentangled evaluation module. 3) We conduct both qualitative and quantitative experiments to compare our model's performance with five other methods on the AACP dataset. The experiments show that our method accurately captures aesthetic features and achieves state-of-the-art performance.
