A new kid on the block: Distributional semantics predicts the word-specific tone signatures of monosyllabic words in conversational Taiwan Mandarin

Xiaoyun Jin, Mirjam Ernestus, R. Harald Baayen

TL;DR

The paper examines how word meaning shapes the realization of pitch contours in monosyllabic words of spontaneous Taiwan Mandarin, challenging the view that canonical tones alone govern tone realization. Using generalized additive models (GAMs) to decompose F0 into components tied to tone pattern, word, and word sense, the study finds robust semantic effects that often surpass the contribution of canonical tone. Heterographic homophones have distinct pitch signatures, and the pitch contours of individual tokens can be predicted from contextualized embeddings, which supports a semantic basis for tone realization within the Discriminative Lexicon Model. Together, the findings argue for a distributional-semantic approach to Mandarin tone and show that contextualized embeddings are useful for phonetic prediction, with implications for theories of tone and of the interaction between lexicon and phonology.

Abstract

We present a corpus-based investigation of how the pitch contours of monosyllabic words are realized in spontaneous conversational Mandarin, focusing on the effects of words' meanings. We used the generalized additive model to decompose a given observed pitch contour into a set of component pitch contours that are tied to different control variables and semantic predictors. Even when variables such as word duration, gender, speaker identity, tonal context, vowel height, and utterance position are controlled for, the effect of word remains a strong predictor of tonal realization. We present evidence that this effect of word is a semantic effect: word sense is shown to be a better predictor than word, and heterographic homophones are shown to have different pitch contours. The strongest evidence for the importance of semantics is that the pitch contours of individual word tokens can be predicted from their contextualized embeddings with an accuracy that substantially exceeds a permutation baseline. For phonetics, distributional semantics is a new kid on the block. Although our findings challenge standard theories of Mandarin tone, they fit well within the theoretical framework of the Discriminative Lexicon Model.
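To make the idea of a permutation baseline concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a plain ridge-regression mapping from embeddings to pitch contours, uses mean Pearson correlation between observed and predicted contours as the accuracy measure, and runs on synthetic data; the function names, the metric, and all dimensions are hypothetical choices made for illustration only.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression mapping embeddings X to contours Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def mean_contour_correlation(Y_true, Y_pred):
    """Average per-token Pearson correlation between observed and predicted contours."""
    yt = Y_true - Y_true.mean(axis=1, keepdims=True)
    yp = Y_pred - Y_pred.mean(axis=1, keepdims=True)
    num = (yt * yp).sum(axis=1)
    den = np.sqrt((yt ** 2).sum(axis=1) * (yp ** 2).sum(axis=1))
    return float(np.mean(num / den))

def permutation_baseline(X_train, Y_train, X_test, Y_test, n_perm=100, seed=0):
    """Refit after shuffling which embedding goes with which training contour."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_perm):
        perm = rng.permutation(len(X_train))
        W = fit_ridge(X_train[perm], Y_train)  # embedding-contour pairing is broken
        scores.append(mean_contour_correlation(Y_test, X_test @ W))
    return np.array(scores)

# Synthetic stand-in data (assumption): 400 tokens, 16-dimensional embeddings,
# contours sampled at 10 normalized time points.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 16))
W_true = rng.normal(size=(16, 10))
Y = X @ W_true + 0.5 * rng.normal(size=(400, 10))

X_train, X_test = X[:300], X[300:]
Y_train, Y_test = Y[:300], Y[300:]

W_hat = fit_ridge(X_train, Y_train)
observed = mean_contour_correlation(Y_test, X_test @ W_hat)
null = permutation_baseline(X_train, Y_train, X_test, Y_test)

print(f"observed score: {observed:.3f}")
print(f"permutation baseline: mean {null.mean():.3f}, max {null.max():.3f}")
```

If the embeddings carry information about the contours, the observed score should lie well above the whole distribution of permuted scores; if they do not, the two should be indistinguishable.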

Paper Structure

This paper contains 13 sections, 12 figures, 1 table.

Figures (12)

  • Figure 1: Distribution of F0 values according to five different transformation methods.
  • Figure 2: Importance of control variables (left panel) and core predictors (right panel). The left-hand panel shows decrease in model fit, gauged by increase in AIC units when a control variable is excluded from the baseline model. The right-hand panel shows the AIC units improvement when a core predictor is added to the baseline model.
  • Figure 3: The partial effect of the interaction of log-transformed duration and normalized time for female (left) and male speakers (right). Darker colors indicate longer word durations. The number in the upper-left corner of each panel indicates the difference in Hz between the lowest and highest values, taking the intercepts for female and male speakers into account, and back-transforming from predicted log F0.
  • Figure 4: Partial effect of utterance position. Words later in an utterance tend to be produced with lower pitch, irrespective of whether the number of basis functions is set to 4 (left panel) or to 10 (right panel).
  • Figure 5: Partial effect of tone pattern in interaction with normalized time. Dashed lines denote the intercept for each tonal pattern predicted by the GAM. The number in the upper-right corner of each panel indicates the difference in Hz between the lowest and highest values.
  • ...and 7 more figures
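
The captions of Figures 3 and 5 report effect sizes in Hz obtained from predictions on the log F0 scale. The sketch below shows why the intercept matters for that conversion. It assumes a natural-log transform and uses invented numbers; `hz_range` is a hypothetical helper, not a function from the paper.

```python
import numpy as np

def hz_range(partial_effect, intercept):
    """Hz difference between the highest and lowest back-transformed values.

    partial_effect: values of a partial effect on the log F0 scale
    intercept: group intercept on the log F0 scale (natural log assumed)
    """
    hz = np.exp(intercept + np.asarray(partial_effect, dtype=float))
    return float(hz.max() - hz.min())

# Invented partial effect spanning -0.1 to 0.1 log units.
effect = [-0.1, 0.0, 0.1]

# Invented intercepts: a higher-pitched and a lower-pitched speaker group.
high_group = hz_range(effect, intercept=np.log(200.0))
low_group = hz_range(effect, intercept=np.log(120.0))

print(f"range around 200 Hz: {high_group:.1f} Hz")
print(f"range around 120 Hz: {low_group:.1f} Hz")
```

The same effect on the log scale corresponds to a larger Hz range for the group with the higher intercept, which is why a back-transformed range has to take the group intercept into account.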