Carousel: A High-Resolution Dataset for Multi-Target Automatic Image Cropping
Rafe Loya, Andrew Hamara, Benjamin Estell, Benjamin Kilpatrick, Andrew C. Freeman
TL;DR
This work tackles multi-target image cropping for high-resolution images, addressing the need for distinct, aesthetically pleasing crops within a single photo. It introduces Carousel, a high-resolution dataset of 277 images with ground-truth multi-target crops and corresponding metadata, and demonstrates a partition-based pre-processing step that enables existing single-target croppers to produce multiple crops. Using a kIoU evaluation framework, the study shows that GAICv2 performs best when combined with the partitioning approach, and it highlights limitations and failure cases that motivate future end-to-end multi-target cropping models. The dataset and evaluation protocol set the stage for developing direct multi-target cropping methods and expanding high-resolution datasets for richer, swipeable social-media experiences.
Abstract
Automatic image cropping is a method for maximizing the human-perceived quality of cropped regions in photographs. Although several works have proposed techniques for producing single crops, little work has addressed the problem of producing multiple, distinct crops with aesthetic appeal. In this paper, we motivate the problem with a discussion of modern social media applications, introduce a dataset of 277 relevant images and human labels, and evaluate the efficacy of several single-crop models with an image partitioning algorithm as a pre-processing step. The dataset is available at https://github.com/RafeLoya/carousel.
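The idea of partitioning as a pre-processing step can be illustrated with a minimal sketch. It is not the paper's algorithm, which this section does not specify. The sketch assumes the image is split into k equal-width vertical strips and that a single-crop model is run independently on each strip. The names `partition_widths`, `multi_crop`, and `center_square` are hypothetical, and `center_square` is only a placeholder for a real single-target cropper such as GAICv2.

```python
from typing import Callable, List, Tuple

# A crop box as (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def partition_widths(width: int, k: int) -> List[Tuple[int, int]]:
    """Split [0, width) into k contiguous strips as (x_offset, strip_width).

    Assumption: equal-width vertical strips; any remainder pixels are
    given to the leftmost strips.
    """
    if k < 1 or k > width:
        raise ValueError("k must be between 1 and the image width")
    base, extra = divmod(width, k)
    strips = []
    x = 0
    for i in range(k):
        w = base + (1 if i < extra else 0)
        strips.append((x, w))
        x += w
    return strips


def multi_crop(
    width: int,
    height: int,
    k: int,
    cropper: Callable[[int, int], Box],
) -> List[Box]:
    """Run a single-crop function on each strip and map boxes back
    to full-image coordinates."""
    crops = []
    for x_off, w in partition_widths(width, k):
        cx, cy, cw, ch = cropper(w, height)
        crops.append((x_off + cx, cy, cw, ch))
    return crops


def center_square(w: int, h: int) -> Box:
    """Placeholder cropper: the largest centered square in a w x h region."""
    side = min(w, h)
    return ((w - side) // 2, (h - side) // 2, side, side)


if __name__ == "__main__":
    for box in multi_crop(300, 100, 3, center_square):
        print(box)
```

A real cropper would take pixel data rather than only the region's dimensions. The placeholder is dimension-only so that the sketch stays self-contained.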
