Using Saliency and Cropping to Improve Video Memorability
Vaibhav Mudgal, Qingyang Wang, Lorin Sweeney, Alan F. Smeaton
TL;DR
This paper investigates whether saliency-guided frame cropping can improve the memorability of short video clips. It combines a CLIP-based memorability predictor with three saliency-based cropping strategies, using DeepGaze IIE to produce saliency maps and a Bayesian Ridge Regressor to score memorability, and evaluates them on a 1,500-video subset of the Memento10k dataset. The key finding is that cropping improves memorability mainly for videos with low initial memorability; fixed and variable saliency tracking offer similar benefits, and returns diminish for videos that are already highly memorable. The work demonstrates a practical, lightweight approach to manipulating video memorability and points to more sophisticated visual manipulations as future work, especially for content that is already highly memorable.
Abstract
Video memorability is a measure of how likely a particular video is to be remembered by a viewer when that viewer has no emotional connection with the video content. It is an important characteristic, as videos that are more memorable are more likely to be shared, viewed, and discussed. This paper presents the results of a series of experiments in which we improved the memorability of a video by selectively cropping frames based on image saliency. We present results for basic fixed cropping as well as for dynamic cropping, where both the size of the crop and its position within the frame change as the video plays and saliency is tracked. Our results indicate that the memorability score can be improved, especially for videos of low initial memorability.
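To make the idea of saliency-guided cropping concrete, the following is a minimal sketch of how a fixed-size crop window might be positioned on a single frame. It is not the authors' implementation: the function names (`saliency_centroid`, `crop_box`, `crop_frame`) are hypothetical, the saliency map is assumed to be a 2D grid of non-negative values already produced by some saliency model, and centring the window on the saliency-weighted centroid is an assumed placement rule chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def saliency_centroid(saliency: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the saliency-weighted centroid (row, col) of a 2D saliency map."""
    height, width = len(saliency), len(saliency[0])
    total = sum(sum(row) for row in saliency)
    if total <= 0:
        # No saliency signal: fall back to the geometric centre of the frame.
        return (height - 1) / 2, (width - 1) / 2
    cy = sum(y * sum(row) for y, row in enumerate(saliency)) / total
    cx = sum(x * v for row in saliency for x, v in enumerate(row)) / total
    return cy, cx


def crop_box(saliency: Sequence[Sequence[float]], crop_h: int, crop_w: int) -> Box:
    """Place a crop_h x crop_w window around the centroid, clamped to the frame."""
    height, width = len(saliency), len(saliency[0])
    if not (0 < crop_h <= height and 0 < crop_w <= width):
        raise ValueError("crop size must be positive and fit inside the frame")
    cy, cx = saliency_centroid(saliency)
    # Centre the window on the centroid, then clamp so it stays inside the frame.
    top = min(max(int(round(cy - crop_h / 2)), 0), height - crop_h)
    left = min(max(int(round(cx - crop_w / 2)), 0), width - crop_w)
    return top, left, top + crop_h, left + crop_w


def crop_frame(frame: Sequence[Sequence], box: Box) -> List[list]:
    """Cut the window described by box out of a frame stored as rows of pixels."""
    top, left, bottom, right = box
    return [list(row[left:right]) for row in frame[top:bottom]]
```

Under these assumptions, applying `crop_box` to each frame's own saliency map makes the window follow the salient region as the video plays. A variable-size variant would additionally need a rule for choosing `crop_h` and `crop_w` per frame, which this sketch does not attempt.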
