PODTILE: Facilitating Podcast Episode Browsing with Auto-generated Chapters

Azin Ghazimatin, Ekaterina Garmash, Gustavo Penha, Kristen Sheets, Martin Achenbach, Oguz Semerci, Remi Galvez, Marcus Tannenberg, Sahitya Mantravadi, Divya Narayanan, Ofeliya Kalaydzhyan, Douglas Cole, Ben Carterette, Ann Clifton, Paul N. Bennett, Claudia Hauff, Mounia Lalmas

TL;DR

PODTILE is a fine-tuned encoder-decoder transformer that segments conversational data into titled chapters. The findings indicate that auto-generated chapters serve as a useful tool for engaging with less popular podcasts, and empirical evidence shows that using chapter titles can enhance the effectiveness of sparse retrieval in search tasks.

Abstract

Listeners of long-form talk-audio content, such as podcast episodes, often find it challenging to understand the overall structure and locate relevant sections. A practical solution is to divide episodes into chapters: semantically coherent segments labeled with titles and timestamps. Since most episodes on our platform at Spotify currently lack creator-provided chapters, automating the creation of chapters is essential. Scaling the chapterization of podcast episodes presents unique challenges. First, episodes tend to be less structured than written texts, featuring spontaneous discussions with nuanced transitions. Second, the transcripts are usually lengthy, averaging about 16,000 tokens, which necessitates efficient processing that can preserve context. To address these challenges, we introduce PODTILE, a fine-tuned encoder-decoder transformer to segment conversational data. The model simultaneously generates chapter transitions and titles for the input transcript. To preserve context, each input text is augmented with global context, including the episode's title, description, and previous chapter titles. In our intrinsic evaluation, PODTILE achieved an 11% improvement in ROUGE score over the strongest baseline. Additionally, we provide insights into the practical benefits of auto-generated chapters for listeners navigating episode content. Our findings indicate that auto-generated chapters serve as a useful tool for engaging with less popular podcasts. Finally, we present empirical evidence that using chapter titles can enhance the effectiveness of sparse retrieval in search tasks.
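
The context augmentation described in the abstract can be illustrated with a minimal sketch. It assumes the transcript is already split into sentences, that token counts can be approximated by whitespace splitting, and that the global context is prepended as labeled plain-text fields. The names `EpisodeContext`, `chunk_sentences`, and `build_model_input`, as well as the field labels and separators, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EpisodeContext:
    """Global context carried across chunks (hypothetical container)."""
    title: str
    description: str
    previous_chapter_titles: List[str] = field(default_factory=list)


def chunk_sentences(sentences: List[str], max_tokens: int) -> List[List[str]]:
    """Greedily group sentences into chunks of at most max_tokens tokens.

    Tokens are approximated by whitespace splitting (an assumption). A single
    sentence longer than max_tokens is kept whole in its own chunk.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and used + n > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(sentence)
        used += n
    if current:
        chunks.append(current)
    return chunks


def build_model_input(context: EpisodeContext, chunk: List[str]) -> str:
    """Prepend episode title, description, and earlier chapter titles to a chunk.

    The field labels and the " | " separator are illustrative choices only.
    """
    previous = " | ".join(context.previous_chapter_titles) or "none"
    return (
        f"title: {context.title}\n"
        f"description: {context.description}\n"
        f"previous chapters: {previous}\n"
        f"transcript: {' '.join(chunk)}"
    )
```

In such a setup, the titles predicted for one chunk would be appended to `previous_chapter_titles` before the next chunk is formatted, so that later chunks see the chapters generated so far.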

Paper Structure

This paper contains 16 sections, 3 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Chapters (purple circles) for (a) an episode about training tips vs. (b) a structured Wikipedia article about training. The episode chapters have short tangential discussions (gray circles), shared context (Peter’s experience), and a consistent title style. In contrast, Wikipedia chapters focus on the main topic with short titles that lack global context.
  • Figure 2: The input and output formatting of the chapterization model. The dotted box is the input to the core model.
  • Figure 3: Ratio of relative chapter engagement between episodes with auto-generated titles and creator-provided titles, plotted as a moving average over the previous 7 days.
  • Figure 4: Percentage of creator-provided and auto-generated chapter plays across five user groups based on consumption.