It's Just Another Day: Unique Video Captioning by Discriminative Prompting

Toby Perrett; Tengda Han; Dima Damen; Andrew Zisserman

TL;DR

This paper formulates the problem of unique captioning: given multiple clips with the same caption, generate a new caption for each clip that uniquely identifies it. It proposes Captioning by Discriminative Prompting (CDP), which predicts a property that can separate identically captioned clips and uses it to generate unique captions.

Abstract

Long videos contain many repeating actions, events and shots. These repetitions are frequently given identical captions, which makes it difficult to retrieve the exact desired clip using a text search. In this paper, we formulate the problem of unique captioning: Given multiple clips with the same caption, we generate a new caption for each clip that uniquely identifies it. We propose Captioning by Discriminative Prompting (CDP), which predicts a property that can separate identically captioned clips and uses it to generate unique captions. We introduce two benchmarks for unique captioning, based on egocentric footage and timeloop movies, where repeating actions are common. We demonstrate that captions generated by CDP improve text-to-video R@1 by 15% for egocentric videos and 10% for timeloop movies.
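
To make the reported metric concrete, the following is a minimal sketch of how text-to-video R@1 could be computed from a caption-to-clip similarity matrix. It is an illustration only, not the paper's evaluation code: the function name `recall_at_1`, the similarity scores, and the lowest-index tie-breaking rule are all assumptions made for this example.

```python
from typing import Sequence


def recall_at_1(similarity: Sequence[Sequence[float]]) -> float:
    """Fraction of captions whose top-scoring clip is the clip they describe.

    similarity[i][j] is a hypothetical score between caption i and clip j,
    where caption i was written for clip i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Index of the highest-scoring clip; ties go to the lowest index
        # (an arbitrary choice made for this sketch).
        best = max(range(len(row)), key=lambda j: row[j])
        hits += int(best == i)
    return hits / len(similarity)


# Three clips sharing one identical caption: every row of scores is the same,
# so at most one caption can retrieve its own clip.
identical = [[0.9, 0.9, 0.9]] * 3

# Made-up scores for three unique captions, each closest to its own clip.
unique = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.2], [0.1, 0.4, 0.7]]

r1_identical = recall_at_1(identical)
r1_unique = recall_at_1(unique)
```

With these toy scores, `r1_identical` is 1/3 and `r1_unique` is 1.0, which shows why identical captions cap retrieval accuracy and why unique captions can raise it.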
