EVOKE: Emotion Vocabulary Of Korean and English

Yoonwon Jung, Hagyeong Shin, Benjamin K. Bergen

TL;DR

EVOKE is a parallel Korean-English emotion vocabulary dataset offering comprehensive lexical coverage, many-to-many translation mappings, and explicit annotation of polysemy and emotion-related metaphor. It contains 1,427 Korean and 1,399 English words, of which 819 Korean and 924 English adjectives and verbs are systematically annotated. The annotation scheme is theory-agnostic, so researchers can take different views of the resource depending on their needs and theoretical perspectives. By identifying translation mappings and language-specific emotion words (lexical gaps), it supports cross-linguistic emotion research in emotion science, psycholinguistics, computational linguistics, and NLP, and may help with tasks such as stimulus selection for experiments and emotion detection. The dataset is publicly available at https://github.com/yoonwonj/EVOKE.

Abstract

This paper introduces EVOKE, a parallel dataset of emotion vocabulary in English and Korean. The dataset offers comprehensive coverage of emotion words in each language, in addition to many-to-many translations between words in the two languages and identification of language-specific emotion words. The dataset contains 1,427 Korean words and 1,399 English words, and we systematically annotate 819 Korean and 924 English adjectives and verbs. We also annotate multiple meanings of each word and their relationships, identifying polysemous emotion words and emotion-related metaphors. The dataset is, to our knowledge, the most comprehensive, systematic, and theory-agnostic dataset of emotion words in both Korean and English to date. It can serve as a practical tool for emotion science, psycholinguistics, computational linguistics, and natural language processing, allowing researchers to adopt different views on the resource reflecting their needs and theoretical perspectives. The dataset is publicly available at https://github.com/yoonwonj/EVOKE.

Paper Structure (29 sections, 2 figures, 2 tables)

This paper contains 29 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Structure of the Korean-English parallel emotion-word dataset. Words in both languages are connected through many-to-many translational mappings, with separate annotations for Korean and English words.
  • Figure 2: Percentage of each annotation value for each annotation criterion in Korean and English adjectives. Percentages for poly12 indicate the proportion of words in each language annotated as having more than one meaning. Percentages for poly13 and poly14 were calculated only over annotations marked in poly12 as having more than one meaning. Error bars indicate 95% confidence intervals.
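
The many-to-many structure described for Figure 1 can be illustrated with a minimal sketch. It does not reflect the actual file layout of the EVOKE repository. The word identifiers (`ko_1`, `en_1`, and so on), the `pairs` list, and the helper names `build_mappings` and `lexical_gaps` are all hypothetical placeholders introduced here. The sketch assumes translations are available as (Korean, English) pairs and treats a word with no translation partner as a language-specific word.

```python
from collections import defaultdict

# Hypothetical placeholder data, NOT taken from EVOKE.
# Each tuple is one (Korean word, English word) translation link.
pairs = [
    ("ko_1", "en_1"),
    ("ko_1", "en_2"),  # one Korean word maps to two English words
    ("ko_2", "en_2"),  # one English word maps to two Korean words
]
korean_words = {"ko_1", "ko_2", "ko_3"}
english_words = {"en_1", "en_2", "en_3"}


def build_mappings(pairs):
    """Index translation links in both directions (many-to-many)."""
    ko_to_en = defaultdict(set)
    en_to_ko = defaultdict(set)
    for ko, en in pairs:
        ko_to_en[ko].add(en)
        en_to_ko[en].add(ko)
    return ko_to_en, en_to_ko


def lexical_gaps(vocab, mapping):
    """Return words in vocab that have no translation partner."""
    return {word for word in vocab if not mapping.get(word)}


ko_to_en, en_to_ko = build_mappings(pairs)
ko_gaps = lexical_gaps(korean_words, ko_to_en)   # Korean-only words
en_gaps = lexical_gaps(english_words, en_to_ko)  # English-only words
```

Keeping an index for each direction makes lookups symmetric, so a language-specific word is simply a vocabulary entry missing from the corresponding index.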