From Trial by Fire To Sleep Like a Baby: A Lexicon of Anxiety Associations for 20k English Multiword Expressions

Saif M. Mohammad

TL;DR

The study addresses the gap in anxiety research for multiword expressions by introducing WorryMWEs, a large-scale lexicon of anxiety and calmness associations for over 20k English MWEs, released as part of WorryLex v2 alongside WorryWords. It relies on crowdsourced annotations on a $-3$ to $3$ scale, demonstrating high reliability and facilitating analyses of how anxiety manifests across word types, MWE types, and different n-gram lengths. The analysis reveals both compositional and noncompositional aspects of MWE anxiety, showing that idioms and longer MWEs are major vectors of anxious associations, while many MWEs derive their anxiety from noncompositional usage. The resource offers broad utility for psychology, NLP, public health, and social sciences, enabling nuanced affective analysis and informing future multilingual expansion and cross-cultural studies; the data are freely available for research use, with careful attention to ethical considerations.

Abstract

Anxiety is the unease about a possible future negative outcome. In recent years, there has been growing interest in understanding how anxiety relates to our health, well-being, body, mind, and behaviour. This includes work on lexical resources for word-anxiety association. However, there is very little anxiety-related work on larger units of text such as multiword expressions (MWE). Here, we introduce the first large-scale lexicon capturing descriptive norms of anxiety associations for more than 20k English MWEs. We show that the anxiety associations are highly reliable. We use the lexicon to study prevalence of different types of anxiety- and calmness-associated MWEs; and how that varies across two-, three-, and four-word sequences. We also study the extent to which the anxiety association of MWEs is compositional (due to its constituent words). The lexicon enables a wide variety of anxiety-related research in psychology, NLP, public health, and social sciences. The lexicon is freely available: https://saifmohammad.com/worrylex.html
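
As a rough illustration of what "compositional" means here, the sketch below compares an MWE's anxiety score with the mean score of its constituent words. It is a minimal sketch under stated assumptions, not the paper's method: the function names, the in-memory dictionaries, and all scores are hypothetical and invented for this example, and the measures actually used in the paper (see Figure 4) may differ.

```python
from statistics import mean

# Hypothetical toy scores on a -3 (calm) to 3 (anxious) scale.
# These values are invented for illustration and are NOT taken from WorryLex.
word_scores = {"trial": 1.5, "fire": 2.0, "sleep": -2.0, "baby": -1.0}
mwe_scores = {"trial by fire": 2.5, "sleep like a baby": -2.5}


def constituent_mean(mwe, word_scores):
    """Mean score of the constituent words that have a score; None if none do."""
    scores = [word_scores[w] for w in mwe.split() if w in word_scores]
    return mean(scores) if scores else None


def compositionality_gap(mwe, mwe_scores, word_scores):
    """Absolute difference between the MWE score and its constituents' mean.

    A small gap suggests the MWE's association is roughly predictable from
    its words; a large gap suggests a noncompositional association.
    """
    base = constituent_mean(mwe, word_scores)
    if base is None:
        return None
    return abs(mwe_scores[mwe] - base)


for mwe in mwe_scores:
    print(mwe, compositionality_gap(mwe, mwe_scores, word_scores))
```

Words without a score (such as "by" or "like" in the toy data) are skipped, which is one of several possible design choices; a real analysis would need to decide how to treat function words and missing entries.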

Paper Structure (11 sections, 6 figures, 1 table)

This paper contains 11 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Distribution of terms in (a) WorryWords (unigrams) and (b) WorryMWEs (MWEs): percentage of terms associated with each class. (The total number is shown in parentheses.)
  • Figure 2: Unigrams - The distribution of grammatical categories in WorryWords (unigrams) (a) and the distribution of anxiety–calmness classes within the grammatical categories of WorryWords (b).
  • Figure 3: MWEs - The distribution of types of MWEs in WorryMWEs (a) and the distribution of anxiety–calmness classes within types of MWEs (b).
  • Figure 4: Measures of Anxiousness Compositionality. The percentages shown in b and c are for the 8,323 bigram MWEs considered.
  • Figure 5: Distribution of Types of MWEs across different ngrams in WorryMWEs: Percentage of terms of a given ngram (bigram, trigram, or fourgram) associated with each MWE type. (The total number is shown in parentheses.)
  • ...and 1 more figure