VideoDiff: Human-AI Video Co-Creation with Alternatives
Mina Huh, Dingzeyu Li, Kim Pimmel, Hijung Valentina Shin, Amy Pavel, Mira Dontcheva
TL;DR
VideoDiff addresses the challenge of reviewing and selecting among many AI-generated video editing alternatives. It is a co-creative tool that generates multiple suggestions for rough cuts, B-rolls, and text effects, and provides aligned, multi-view diff visualizations (timeline, transcript, and video preview) to support sensemaking. In a formative study and a within-subject user evaluation (N=12), VideoDiff reduced comparison time, lowered cognitive load, and increased user satisfaction and perceived usefulness for video authoring, although some participants raised concerns about expressiveness and control. The work demonstrates the practical value of structured alternative management in video editing and outlines future directions: broader editing tasks, multimodal inputs, personalization, and accessibility.
Abstract
To make an engaging video, people sequence interesting moments and add visuals such as B-rolls or text. While video editing requires time and effort, AI has recently shown strong potential to make editing easier through suggestions and automation. A key strength of generative models is their ability to quickly generate multiple variations, but when provided with many alternatives, creators struggle to compare them to find the best fit. We propose VideoDiff, an AI video editing tool designed for editing with alternatives. With VideoDiff, creators can generate and review multiple AI recommendations for each editing process: creating a rough cut, inserting B-rolls, and adding text effects. VideoDiff simplifies comparisons by aligning videos and highlighting differences through timelines, transcripts, and video previews. Creators have the flexibility to regenerate and refine AI suggestions as they compare alternatives. Our study participants (N=12) could easily compare and customize alternatives, creating more satisfying results.
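To make the idea of aligning alternatives against a shared transcript concrete, here is a minimal sketch. It is not VideoDiff's implementation: it assumes each rough cut can be represented as a list of indices into the source transcript, and the function name `align_rough_cuts`, the label strings, and the sample sentences are all hypothetical.

```python
def align_rough_cuts(source, cut_a, cut_b):
    """Label each source sentence by which alternative keeps it.

    source: list of transcript sentences from the original footage.
    cut_a, cut_b: indices into `source` kept by each rough cut
                  (an assumed representation, for illustration only).
    """
    keep_a, keep_b = set(cut_a), set(cut_b)
    labels = {
        (True, True): "both",
        (True, False): "only_a",
        (False, True): "only_b",
        (False, False): "neither",
    }
    # One row per source sentence, so both alternatives share one alignment.
    return [(labels[(i in keep_a, i in keep_b)], s) for i, s in enumerate(source)]


source = [
    "Hi, welcome back.",
    "Today we bake bread.",
    "First, mix the flour.",
    "Then let it rest.",
    "Thanks for watching.",
]
rows = align_rough_cuts(source, cut_a=[1, 2, 3], cut_b=[0, 1, 3, 4])
for label, sentence in rows:
    print(f"{label:8} {sentence}")
```

Because every row is keyed to the same source sentence, the differences between the two cuts can be read line by line, and the same labels could drive highlighting in a timeline view.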
