Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art

Chen Cecilia Liu, Iryna Gurevych, Anna Korhonen

TL;DR

The paper addresses the lack of a shared understanding of culture in NLP by proposing a fine-grained, anthropology-grounded taxonomy that organizes cultural variation into ideational, linguistic, and social elements. It surveys 127 publications from leading NLP venues, categorizing resources and methods according to the taxonomy to map progress and identify gaps. Key contributions include the taxonomy itself, a systematic survey of resources and modeling approaches, and practical recommendations for data collection, model adaptation, and evaluation that account for cultural diversity. The study highlights notable progress in knowledge and values resources while revealing gaps in multilingual coverage, social-contextual annotations, and deeper cultural modeling beyond surface adaptations, with implications for fairer and more inclusive NLP systems. This taxonomy and survey provide a framework for guiding future research and resource development toward culturally aware NLP with broader real-world impact.

Abstract

The surge of interest in "culture" in NLP has inspired much recent research, but a shared understanding of "culture" remains unclear, making it difficult to evaluate progress in this emerging area. Drawing on prior research in NLP and related fields, we propose a fine-grained taxonomy of elements in culture that can provide a systematic framework for analyzing and understanding research progress. Using the taxonomy, we survey existing resources and methods for culturally aware and adapted NLP, providing an overview of the state of the art and the research gaps that still need to be filled.

Paper Structure (35 sections, 5 figures, 1 table)

This paper contains 35 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Papers with titles and abstracts containing "culture" or "cultural" published in the main and Findings proceedings of AACL/EACL/NAACL/ACL/EMNLP and in TACL within 5 years, with percentages normalized by the total number of papers at the included venues to date.
  • Figure 2: An overview of the taxonomy with examples of subcategories of future possible expansions. The elements in culture are organized into three different branches: ideational, linguistic, and social. The ideational branch (§\ref{sec:cult_elements}) encompasses the non-material aspects of culture that constitute a way of life. The linguistic branch (§\ref{sec:forms}) focuses on cultural variations in language and linguistic forms, bridging the ideational and social elements of culture. The social branch (§\ref{sec:sociocult}) covers key factors in social interaction and communication.
  • Figure 3: An illustration of surface versus deep culturally adapted NLP models.
  • Figure 4: Categorization of the methods for resource acquisition with representative examples.
  • Figure 5: Categorization of the adaptation modelling methods and examples in each category.