Carousel: A High-Resolution Dataset for Multi-Target Automatic Image Cropping

Rafe Loya, Andrew Hamara, Benjamin Estell, Benjamin Kilpatrick, Andrew C. Freeman

TL;DR

This work tackles multi-target image cropping for high-resolution images, addressing the need for distinct, aesthetically pleasing crops within a single photo. It introduces Carousel, a high-resolution dataset of 277 images with ground-truth multi-target crops and corresponding metadata, and demonstrates a partition-based pre-processing step that enables existing single-target croppers to produce multiple crops. Through a kIoU evaluation framework, the study shows GAICv2 performs best when combined with the partitioning approach, while highlighting limitations and failure cases that motivate future end-to-end multi-target cropping models. The dataset and evaluation protocol set the stage for developing direct multi-target cropping methods and expanding high-resolution datasets for richer, swipeable social-media experiences.

Abstract

Automatic image cropping is a method for maximizing the human-perceived quality of cropped regions in photographs. Although several works have proposed techniques for producing singular crops, little work has addressed the problem of producing multiple, distinct crops with aesthetic appeal. In this paper, we motivate the problem with a discussion on modern social media applications, introduce a dataset of 277 relevant images and human labels, and evaluate the efficacy of several single-crop models with an image partitioning algorithm as a pre-processing step. The dataset is available at https://github.com/RafeLoya/carousel.

Paper Structure

This paper contains 14 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Motivating example of multi-target image cropping.
  • Figure 2: Comparison of highest-scoring crops from GAICv2 on an image and its partitions, derived from the multi-region saliency partitioning algorithm. Note the significant overlap present in the crops predicted for (a), as the model is designed to produce a single optimized crop.
  • Figure 3: Example failure case for our partitioning algorithm.
  • Figure 4: Visual comparison of multi-target crops on our dataset. (b) shows the multi-view outputs of VPN on the original images, while (c)-(f) use our multi-region saliency partitioning algorithm followed by the single-target cropping models.
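
The caption of Figure 2 notes significant overlap among crops predicted on an unpartitioned image. One standard way to quantify overlap between two crop boxes is intersection-over-union (IoU). The following is a minimal sketch of plain pairwise IoU only; it is not the kIoU metric used in the paper, whose definition is not given in this section. The `(x1, y1, x2, y2)` box convention and the function name `iou` are assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical box convention: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

Under this sketch, two crops with a pairwise IoU near 1 cover almost the same region, while an IoU of 0 means they share no pixels.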