NRC VAD Lexicon v2: Norms for Valence, Arousal, and Dominance for over 55k English Terms

Saif M. Mohammad

TL;DR

The paper presents the NRC VAD Lexicon v2, a large, freely available resource providing Valence, Arousal, and Dominance ratings for over 55,000 English terms and phrases, including about 25,000 new words and 10,000 multi-word expressions (MWEs). It describes a crowdsourced annotation pipeline with rigorous quality control, in which responses on a $-3$ to $3$ scale are mapped to final VAD scores in $[-1,1]$, and reports high reliability, with $\rho$ and $r$ exceeding $0.95$ across dimensions. The methodology combines diverse term sources (including prevalence-based unigrams and MWEs) with IRB-approved data collection and extensive quality control. The lexicon's broad coverage and reliability support a wide range of research and applications in NLP, psychology, and digital humanities. The paper acknowledges limitations related to language variety, domain-specific word senses, and socio-cultural biases. The lexicon is released to the research community to facilitate further work on affective word representations.
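The mapping from the $-3$ to $3$ response scale to $[-1,1]$ scores can be illustrated with a minimal Python sketch. It assumes that a term's score is the mean of its annotator ratings, linearly rescaled; the summary does not spell out the aggregation step, so the averaging, the function names `rescale_rating` and `term_score`, and the sample ratings are all hypothetical.

```python
from statistics import mean


def rescale_rating(value: float, low: float = -3.0, high: float = 3.0) -> float:
    """Linearly map a value from [low, high] onto [-1, 1]."""
    if not low <= value <= high:
        raise ValueError(f"rating {value} is outside [{low}, {high}]")
    return 2.0 * (value - low) / (high - low) - 1.0


def term_score(ratings: list[float]) -> float:
    """Assumed aggregation: average the raw ratings, then rescale the mean."""
    return rescale_rating(mean(ratings))


# Hypothetical annotator ratings on the -3..3 scale (not taken from the lexicon).
hypothetical_ratings = {
    "term_a": [3, 2, 3, 2],
    "term_b": [-3, -2, -3],
}

scores = {term: term_score(r) for term, r in hypothetical_ratings.items()}
```

With the default bounds the rescaling reduces to dividing the mean rating by 3, so the endpoints $-3$ and $3$ map to $-1$ and $1$ and a neutral rating of $0$ stays at $0$.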

Abstract

Factor analysis studies have shown that the primary dimensions of word meaning are Valence (V), Arousal (A), and Dominance (D) (also referred to in social cognition research as Competence (C)). These dimensions impact various aspects of our lives, from social competence and emotion regulation to success in the workplace and how we view the world. We present here the NRC VAD Lexicon v2, which has human ratings of valence, arousal, and dominance for more than 55,000 English words and phrases. Notably, it adds entries for $\sim$25k additional words to v1.0. For the first time, it also includes entries for common multi-word phrases ($\sim$10k). We show that the associations are highly reliable. The lexicon enables a wide variety of research in psychology, NLP, public health, digital humanities, and social sciences. The NRC VAD Lexicon v2 is made freely available for research through our project webpage.

Paper Structure

This paper contains 10 sections, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Valence Questionnaire: Detailed instructions.
  • Figure 2: Valence Questionnaire: Sample question.
  • Figure 3: Valence Questionnaire: Examples.
  • Figure 4: Arousal Questionnaire: Detailed instructions.
  • Figure 5: Arousal Questionnaire: Sample question.
  • ...and 4 more figures