Evaluating Keyframe Layouts for Visual Known-Item Search in Homogeneous Collections

Bastian Jäckl; Jiří Kruchina; Lucas Joos; Daniel A. Keim; Ladislav Peška; Jakub Lokoč

TL;DR

This work investigates how keyframe grid layouts influence browsing efficiency and accuracy in Visual Known-Item Search on a large-scale, homogeneous MVK dataset. It compares seven layouts (four ranked, two sorted, one grouped) in a within-subject design with $|C_r|=200$ candidates, $|P|=49$ participants, and $1715$ tasks, analyzing efficiency, accuracy, and browsing behaviors such as region skipping and overlooks. The study finds that a video-grouped layout (V8) is fastest overall, while a four-column rank-preserving grid (G4 lp) yields the best accuracy. Sorted and grouped layouts let users exclude large regions efficiently, but they incur later first arrivals and more overlooks. The findings motivate hybrid designs that preserve the positions of top-ranked items while sorting or grouping the remainder, with broader implications for grid-based search interfaces beyond video retrieval.

Abstract

Multimodal deep-learning models power interactive video retrieval by ranking keyframes in response to textual queries. Despite these advances, users must still browse ranked candidates manually to locate a target. The arrangement of keyframes within the search grid strongly affects browsing effectiveness and user efficiency, yet remains underexplored. We report a study with 49 participants evaluating seven keyframe layouts for the Visual Known-Item Search task. Beyond efficiency and accuracy, we relate browsing phenomena, such as overlooks, to layout characteristics. Our results show that a video-grouped layout is the most efficient, while a four-column, rank-preserving grid achieves the highest accuracy. Sorted grids show both promise and trade-offs: they enable rapid scanning of uninteresting regions but move relevant targets to less prominent positions, delaying first arrivals and increasing overlooks. These findings motivate hybrid designs that preserve the positions of top-ranked items while sorting or grouping the remainder, and offer guidance for searching in grids beyond video retrieval.
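
To make the hybrid idea concrete, the following is a minimal sketch of one possible arrangement, not a layout evaluated in the study. It keeps the first `top_k` candidates in rank order and groups the rest by source video. The names `Keyframe` and `hybrid_layout`, the `top_k` cutoff, and the choice to order video groups by their best-ranked remaining frame are all illustrative assumptions.

```python
from typing import List, NamedTuple


class Keyframe(NamedTuple):
    rank: int      # 1 = best match according to the retrieval model
    video_id: str  # source video of the keyframe


def hybrid_layout(candidates: List[Keyframe], top_k: int, columns: int) -> List[List[Keyframe]]:
    """Hypothetical hybrid grid: rank-preserving head, video-grouped tail."""
    ranked = sorted(candidates, key=lambda kf: kf.rank)
    head, tail = ranked[:top_k], ranked[top_k:]

    # Assumption: each video group is placed according to its best-ranked
    # remaining keyframe. The tail is in rank order, so the first frame seen
    # for a video is its best one.
    best_rank = {}
    for kf in tail:
        best_rank.setdefault(kf.video_id, kf.rank)

    grouped_tail = sorted(tail, key=lambda kf: (best_rank[kf.video_id], kf.rank))
    ordered = head + grouped_tail

    # Fill the grid row by row with a fixed number of columns.
    return [ordered[i:i + columns] for i in range(0, len(ordered), columns)]


if __name__ == "__main__":
    demo = [
        Keyframe(1, "a"), Keyframe(2, "b"), Keyframe(3, "a"),
        Keyframe(4, "c"), Keyframe(5, "b"), Keyframe(6, "c"),
    ]
    for row in hybrid_layout(demo, top_k=2, columns=4):
        print([(kf.rank, kf.video_id) for kf in row])
```

In this sketch the two best candidates stay where a ranked grid would put them, while lower-ranked frames from the same video end up adjacent, which is the property that lets a user dismiss a whole region at once.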
