A Functional Trade-off between Prosodic and Semantic Cues in Conveying Sarcasm

Zhu Li, Xiyuan Gao, Yuqing Zhang, Shekhar Nayak, Matt Coler

TL;DR

This study investigates how prosodic cues ($F0$, duration, and amplitude) and semantic incongruity interact to convey sarcasm across three subtypes (Embedded, Propositional, Illocutionary) using the MUStARD++ dataset. Through utterance-level and key-phrase-level analyses with linear mixed-effects models, it finds that sarcasm generally lowers mean $F0$ and increases mean amplitude at the utterance level, while key-phrase prosody shows subtype-specific patterns. These patterns reveal a trade-off: semantically dense sarcasm relies less on prosody, whereas semantically sparse sarcasm leans on prosodic signaling. The findings emphasize the importance of combining coarse-grained and fine-grained prosodic analyses for accurate sarcasm detection and have potential implications for controllable speech synthesis and multimodal dialogue systems. The work advances understanding of how listeners integrate local prosody with semantic cues to interpret sarcastic intent in spontaneous speech.

Abstract

This study investigates the acoustic features of sarcasm and disentangles the interplay between the propensity of an utterance being used sarcastically and the presence of prosodic cues signaling sarcasm. Using a dataset of sarcastic utterances compiled from television shows, we analyze the prosodic features within utterances and key phrases belonging to three distinct sarcasm categories (embedded, propositional, and illocutionary), which vary in the degree of semantic cues present, and compare them to neutral expressions. Results show that in phrases where the sarcastic meaning is salient from the semantics, the prosodic cues are less relevant than when the sarcastic meaning is not evident from the semantics, suggesting a trade-off between prosodic and semantic cues of sarcasm at the phrase level. These findings highlight a lessened reliance on prosodic modulation in semantically dense sarcastic expressions and a nuanced interaction that shapes the communication of sarcastic intent.

Paper Structure (11 sections, 2 figures, 2 tables)

Figures (2)

  • Figure 1: Utterance-level prosodic features across neutral (NONE) and the three sarcastic (EMB, PRO, and ILL) types.
  • Figure 2: Key phrase-level prosodic features of the three sarcasm types (EMB, PRO, and ILL). "1" indicates sarcastic key phrases, whereas "0" signals their non-sarcastic counterparts.