PodReels: Human-AI Co-Creation of Video Podcast Teasers

Sitong Wang, Zheng Ning, Anh Truong, Mira Dontcheva, Dingzeyu Li, Lydia B. Chilton

TL;DR

PodReels tackles the difficulty of producing engaging short teasers from hour-long video podcast episodes by introducing a human-AI co-creative workflow embedded in Premiere Pro. It combines transcript-based extraction, speaker diarization, LLM-guided clip suggestions, and sentence-level refinement to assemble and polish teasers. A formative study informs the design goals, a technical evaluation demonstrates high accuracy of LLM-based clip extraction, and a user study shows significant reductions in mental demand and time, with increased perceived quality. The work argues that integrating AI within professional creative tools can improve both the quantity and quality of teaser outputs while preserving creator control and brand integrity.

Abstract

Video podcast teasers are short videos that can be shared on social media platforms to capture interest in the full episodes of a video podcast. These teasers enable long-form podcasters to reach new audiences and gain new followers. However, creating a compelling teaser from an hour-long episode is challenging. Selecting interesting clips requires significant mental effort; editing the chosen clips into a cohesive, well-produced teaser is time-consuming. To support the creation of video podcast teasers, we first investigate what makes a good teaser. We combine insights from both audience comments and creator interviews to determine a set of essential ingredients. We also identify a common workflow shared by creators during the process. Based on these findings, we introduce a human-AI co-creative tool called PodReels to assist video podcasters in creating teasers. Our user study shows that PodReels significantly reduces creators' mental demand and improves their efficiency in producing video podcast teasers.

Paper Structure

This paper contains 58 sections, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: The review step helps creators pick the right moment for their teaser. (A) Based on user-provided queries (e.g., desired length), we show a list of candidate moments for users to scroll through and review. For each moment, we provide a compact yet informative view to help users make a decision. (B) We show an AI-generated tagline and the duration of each moment. (C) We also show who the active speakers are, along with a liveliness score computed from audio features, and verify whether the input keywords are present in the moment. (D) Finally, we show an expandable full transcript view that users can glance through.
  • Figure 2: In the refine step, we show a sentence-based view. (A) By default, the center portion, which represents the sentences within the selected moment, is selected. (B) Users can toggle each sentence on or off to include it in or exclude it from the teaser. (C) To provide better context, users can pick from the surrounding sentences before and after the moment and add them to the teaser. (D) Users can reorder the sentences to improve the flow. (E) If users want to add a sentence that is not listed, they can use the search box to find it in the full transcript and drop it into the teaser. (F) Users can also remove all filler words with a toggle switch.
  • Figure 3: The music step allows users to easily add music to their teaser to make it more engaging. (A) Users can pick a preset style from the list. (B) The system automatically identifies the peak (i.e., the emphasis point) in the transcript and highlights it with a star icon. (C) If users want to change the emphasis point, they can click on the new sentence. (D) Users can preview the teaser with music. Under the hood, PodReels has added and arranged the music to align with the emphasis point. For example, the system will insert the music’s intro/verse during non-emphasized sentences and insert a chorus on the emphasized sentence.
  • Figure 4: Screenshot of the baseline system. For production support, the baseline has the same transitions, music, and finish features as PodReels. For content assembly, however, it provides only basic transcript-based selection and preview functionality.
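
As a rough illustration of the music alignment described in Figure 3, the sketch below assigns a music section to each sentence of a teaser: the emphasized sentence gets the chorus, and every other sentence gets the verse. The names `Sentence` and `arrange_music`, the section labels, and the per-sentence durations are all hypothetical and chosen for illustration; this is not PodReels' actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    duration: float  # seconds of speech in the teaser (assumed to be known)


def arrange_music(sentences, emphasis_index):
    """Return a list of (start, end, section) music cues.

    Hypothetical rule based on the Figure 3 caption: the emphasized
    sentence is scored with the "chorus"; all others with the "verse".
    Adjacent sentences that share a section are merged into one cue.
    """
    if not 0 <= emphasis_index < len(sentences):
        raise ValueError("emphasis_index out of range")

    cues = []
    start = 0.0
    for i, sentence in enumerate(sentences):
        section = "chorus" if i == emphasis_index else "verse"
        end = start + sentence.duration
        if cues and cues[-1][2] == section:
            # Extend the previous cue instead of starting a new one.
            cues[-1] = (cues[-1][0], end, section)
        else:
            cues.append((start, end, section))
        start = end
    return cues


teaser = [
    Sentence("So how did you get started?", 2.0),
    Sentence("Honestly, it was an accident.", 3.0),
    Sentence("I quit my job the next day.", 1.5),
    Sentence("And I never looked back.", 2.5),
]
cues = arrange_music(teaser, emphasis_index=2)
for cue in cues:
    print(cue)
```

Changing `emphasis_index` mirrors the interaction in panel (C) of Figure 3: picking a different sentence as the emphasis point moves the chorus cue and leaves the rest of the timeline on the verse.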