Crossing Boundaries: Leveraging Semantic Divergences to Explore Cultural Novelty in Cooking Recipes
Florian Carichon, Romain Rampa, Golnoosh Farnadi
TL;DR
The paper addresses the measurement of cultural novelty in text, bridging sociology and NLP through divergence-based metrics. It introduces GlobalFusion, a dataset of 500 dishes and ~100k recipes from 150+ countries, and five information-theoretic cultural novelty metrics grounded in distribution divergence. The approach uses a culturally oriented knowledge space and Jensen–Shannon divergence to quantify dimensions of novelty including Newness, Uniqueness, Difference, and Surprise. The metrics are validated against multiple cultural-distance measures and show statistically significant, though modest, correlations. The work offers a framework for interpreting linguistic variation as a proxy for cultural difference and provides tools for evaluating culturally aware AI systems, including LLMs, with the potential to generalize to domains beyond cooking.
Abstract
Novelty modeling and detection is a core topic in Natural Language Processing (NLP), central to numerous tasks such as recommender systems and automatic summarization. It involves identifying pieces of text that deviate in some way from previously known information. However, novelty is also a crucial determinant of the perceived relevance and quality of an experience, as it rests upon each individual's understanding of the world. Social factors, particularly cultural background, profoundly influence perceptions of novelty and innovation. Cultural novelty arises from differences in salience and novelty as shaped by the distance between distinct communities. While cultural diversity has garnered increasing attention in artificial intelligence (AI), the lack of robust metrics for quantifying cultural novelty hinders a deeper understanding of these divergences and limits the study of cultural differences within computational frameworks. To address this, we propose an interdisciplinary framework that integrates knowledge from sociology and management. Central to our approach is GlobalFusion, a novel dataset comprising 500 dishes and approximately 100,000 cooking recipes capturing cultural adaptation from over 150 countries. By introducing a set of novelty metrics based on Jensen–Shannon divergence, we leverage this dataset to analyze textual divergences when recipes from one community are modified by another with a different cultural background. The results reveal significant correlations between our cultural novelty metrics and established cultural measures based on linguistic, religious, and geographical distances. Our findings highlight the potential of our framework to advance the understanding and measurement of cultural diversity in AI.
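For readers unfamiliar with the Jensen–Shannon divergence mentioned above, the following is a minimal sketch of how it compares two word distributions. It is not the authors' implementation or any of their specific novelty metrics; the function names, the whitespace tokenization, and the two toy recipe strings are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def word_distribution(text, vocab):
    """Relative frequency of each vocabulary word in a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total for w in vocab]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint m.

    With base-2 logarithms the value lies in [0, 1].
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical toy texts standing in for an original recipe and an adaptation.
original = "spaghetti guanciale egg pecorino pepper"
adapted = "spaghetti bacon egg cream parmesan pepper"

vocab = sorted(set(original.split()) | set(adapted.split()))
p = word_distribution(original, vocab)
q = word_distribution(adapted, vocab)

print(round(js_divergence(p, q), 3))
```

The divergence is symmetric, equals 0 for identical distributions, and reaches 1 when the two texts share no vocabulary, which makes it a convenient bounded score for comparing a recipe with its adaptation.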
