Uplifting Lower-Income Data: Strategies for Socioeconomic Perspective Shifts in Large Multi-modal Models

Joan Nwatu; Oana Ignat; Rada Mihalcea

TL;DR

This work proposes and evaluates several prompting strategies that use non-English, geographic, and socioeconomic attributes. These prompts favor retrieving topic appearances commonly found in data from low-income households across different countries, which improves LMM performance on lower-income data.

Abstract

Recent work has demonstrated that the unequal representation of cultures and socioeconomic groups in training data leads to biased Large Multi-modal Models (LMMs). To improve LMM performance on underrepresented data, we propose and evaluate several prompting strategies using non-English, geographic, and socioeconomic attributes. We show that these geographic and socioeconomic integrated prompts favor retrieving topic appearances commonly found in data from low-income households across different countries, leading to improved LMM performance on lower-income data. Our analyses identify and highlight contexts where these strategies yield the most improvements.
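As a rough illustration of the prompt families named in the abstract and outline (default English, translated, country suffix, income suffix), the sketch below assembles one prompt of each kind for a topic. The template wording, the function name `build_prompts`, and the suffix phrasing are assumptions made for illustration; the paper's exact prompt text is not given in this summary, and the translated prompt is expected to be supplied by the caller rather than produced here.

```python
from typing import Dict, Optional

# Hypothetical base template; the paper's exact wording is not stated here.
BASE_TEMPLATE = "This is a photo of {topic}"


def build_prompts(
    topic: str,
    translated_prompt: Optional[str] = None,
    country: Optional[str] = None,
    income: Optional[str] = None,
) -> Dict[str, str]:
    """Return one prompt per strategy for which the needed attribute is given."""
    base = BASE_TEMPLATE.format(topic=topic)
    prompts = {"default_english": base}

    # Translated prompt: passed in already translated (no translation API assumed).
    if translated_prompt is not None:
        prompts["translated"] = translated_prompt

    # Country suffix: append geographic context to the English prompt.
    if country is not None:
        prompts["country_suffix"] = f"{base} in {country}"

    # Income suffix: append socioeconomic context to the English prompt.
    if income is not None:
        prompts["income_suffix"] = f"{base} from a {income} household"

    return prompts


if __name__ == "__main__":
    example = build_prompts(
        "a stove",
        translated_prompt="Ceci est une photo d'une cuisinière",
        country="Nigeria",
        income="poor",
    )
    for name, text in example.items():
        print(f"{name}: {text}")
```

Each resulting string would then be encoded by the model's text encoder and compared against image embeddings for retrieval; that step is omitted here because it depends on the specific model and library used.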

Paper Structure (40 sections, 7 figures, 14 tables)

Introduction
Related Work
Addressing AI Performance Inequality.
Multilingual AI Models.
Prompting AI Models.
Methodology
Dollar Street Dataset
Image Income Classes.
Country Economic Classes.
Topic Representations.
Prompt Design
Default English Topic Prompt.
Translated Topic Prompt.
Country Suffix Topic Prompt.
Income Suffix Topic Prompt.
...and 25 more sections

Figures (7)

Figure 1: Low-income image retrieval from the Dollar Street dataset (Rojas et al., 2022) using different prompt formulations. Prompts with integrated country and income information successfully retrieve less-standard images previously left out by the English and translated (French) prompts.
Figure 2: NLLB SigLIP Recall (%) over poor and lower-middle income images from four countries, one from each of four continents (Asia, Africa, America, and Europe), for English and native translated prompts. Best viewed in color.
Figure 3: Recall scores for lower income images from 39 countries and 28 languages. The cyan highlight shows the Recall for a country's native translated language, the yellow highlight shows the best-performing language recall, and the red shows the Recall for the language that is both the native and highest performing for that country. Best viewed in color.
Figure 4: Recall (%) with NLLB SigLIP over poor and lower-middle income images from four countries from Asia, Africa, America, and Europe, for English and Country Suffix prompts. Best viewed in color.
Figure 5: Average Recall with NLLB SigLIP over poor and lower-middle income images, for English and Income Suffix prompts. Best viewed in color.
...and 2 more figures
