StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos

Valentin Barriere, Nahuel Gomez, Leo Hemamou, Sofia Callejas, Brian Ravenet

TL;DR

StandUp4AI introduces the largest multilingual stand-up humor dataset to date, spanning seven languages and about 334 hours, with automatic laughter annotations refined by a novel ASR-based error-correction pipeline. The authors reframe humor detection as a word-level sequence labeling task to capture continuous laughter events, and validate this approach with an ASR-based laughter detector and a Random Forest candidate validator. They provide baseline results using a cross-lingual transformer and show gains from ASR-enhanced data and multilingual training, highlighting the dataset's potential to advance cross-language humor modeling. This work offers a publicly available resource and practical baselines to foster reproducibility and future multimodal extensions for humor-aware interactive systems.

Abstract

Aiming to improve current computational models of humor detection, we propose a new multimodal dataset of stand-up comedies in seven languages: English, French, Spanish, Italian, Portuguese, Hungarian and Czech. Our dataset of more than 330 hours is, at the time of writing, the biggest available for this type of task, and the most diverse. The whole dataset is automatically annotated for laughter (from the audience), and the subpart left for model validation is manually annotated. Contrary to contemporary approaches, we do not frame the task of humor detection as a binary sequence classification, but as word-level sequence labeling, in order to take into account all the context of the sequence and to capture the continuous joke tagging mechanism typically occurring in natural conversations. Along with unimodal baseline results, we propose a method to enhance the automatic laughter detection based on Automatic Speech Recognition (ASR) errors. Our code and data are available online: https://tinyurl.com/EMNLPHumourStandUpPublic
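
To make the word-level framing concrete, the following is a minimal sketch of how per-word labels could be derived from word timestamps and audience-laughter intervals. It is an illustration under stated assumptions, not the authors' annotation procedure: the `Word` class, the `label_words` function, and the `max_gap` tolerance are all hypothetical names and choices. Here a word is tagged 1 when a laughter interval overlaps it or begins within `max_gap` seconds after it ends.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """A transcribed word with start/end times in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def label_words(
    words: List[Word],
    laughs: List[Tuple[float, float]],
    max_gap: float = 0.5,  # assumed tolerance between word end and laughter onset
) -> List[int]:
    """Return one binary label per word.

    A word gets label 1 if any laughter interval (start, end) overlaps the
    window [word.start, word.end + max_gap]; otherwise it gets label 0.
    """
    labels = []
    for w in words:
        triggered = any(
            laugh_start <= w.end + max_gap and laugh_end >= w.start
            for laugh_start, laugh_end in laughs
        )
        labels.append(1 if triggered else 0)
    return labels


# Toy example: one laughter burst shortly after the word "left".
example_words = [
    Word("so", 0.0, 0.3),
    Word("my", 0.4, 0.6),
    Word("cat", 0.7, 1.0),
    Word("left", 1.1, 1.5),
    Word("then", 3.0, 3.3),
]
example_laughs = [(1.7, 2.6)]
example_labels = label_words(example_words, example_laughs)
```

Unlike a single binary label for the whole utterance, this yields a label sequence aligned with the words, so a token-classification model can see the full context and predict where in the sequence laughter is triggered.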

Paper Structure

This paper contains 25 sections, 1 figure, 7 tables.

Figures (1)

  • Figure 1: Overview of humor detection modeled as a sequence labeling task, and the method relying on complementary errors from the ASR outputs. Omine2024 model detected no laughter. Video https://youtu.be/OxvCVuGQ-uk?feature=shared&t=42