Finding Culture-Sensitive Neurons in Vision-Language Models

Xiutian Zhao, Rochelle Choenni, Rohit Saxena, Ivan Titov

TL;DR

This work reveals that vision-language models harbor culture-sensitive neurons whose activations are preferentially tied to specific cultures. It introduces Contrastive Activation Selection (CAS), a margin-based method that more precisely identifies these neurons than probability- or entropy-based approaches, and demonstrates their causal role through targeted ablations in three VLMs across 25 cultures on CVQA. Layer-wise analysis shows these neurons concentrate in the decoder, particularly in early and mid-layers, suggesting localized cultural knowledge representation. The findings offer a path toward targeted interventions to improve cultural alignment and fairness in multimodal models without full-scale retraining, and motivate activation-steering strategies for bias mitigation. Overall, the study advances our understanding of how multimodal models encode culturally grounded knowledge and where to intervene to refine their cross-cultural behavior.

Abstract

Despite their impressive performance, vision-language models (VLMs) still struggle on culturally situated inputs. To understand how VLMs process culturally grounded information, we study the presence of culture-sensitive neurons, i.e., neurons whose activations show preferential sensitivity to inputs associated with particular cultural contexts. We examine whether such neurons are important for culturally diverse visual question answering and where they are located. Using the CVQA benchmark, we identify culture-selective neurons and perform causal tests by deactivating the neurons flagged by different identification methods. Experiments on three VLMs across 25 cultural groups demonstrate the existence of neurons whose ablation disproportionately harms performance on questions about the corresponding cultures while having minimal effects on others. Moreover, we propose a new margin-based selector, Contrastive Activation Selection (CAS), and show that it outperforms existing probability- and entropy-based methods in identifying culture-sensitive neurons. Finally, our layer-wise analyses reveal that such neurons tend to cluster in certain decoder layers. Overall, our findings shed new light on the internal organization of multimodal representations.
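
To make the idea of a margin-based selector concrete, here is a minimal sketch. The abstract does not give the CAS formula, so the score used here is an assumption for illustration only, not the paper's definition: each neuron is scored by its mean activation on the target culture minus its highest mean activation on any other culture, and the top r% of neurons are kept. The function names, the toy activations, and the culture labels "A", "B", "C" are all hypothetical.

```python
from typing import Dict, List


def margin_scores(mean_act: Dict[str, List[float]], target: str) -> List[float]:
    """Hypothetical margin score per neuron (an assumption, not the paper's CAS):
    mean activation on the target culture minus the largest mean activation
    observed on any other culture."""
    others = [c for c in mean_act if c != target]
    if not others:
        raise ValueError("need at least one other culture to contrast against")
    scores = []
    for j in range(len(mean_act[target])):
        best_other = max(mean_act[c][j] for c in others)
        scores.append(mean_act[target][j] - best_other)
    return scores


def select_top_percent(scores: List[float], r: float) -> List[int]:
    """Return the (sorted) indices of the top r% neurons by score, at least one."""
    k = max(1, int(len(scores) * r / 100))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(order[:k])


# Toy per-culture mean activations for 4 neurons (made-up numbers).
mean_act = {
    "A": [0.9, 0.2, 0.5, 0.1],
    "B": [0.1, 0.8, 0.5, 0.1],
    "C": [0.2, 0.1, 0.4, 0.7],
}
scores_a = margin_scores(mean_act, "A")
selected_a = select_top_percent(scores_a, r=25)
selected_c = select_top_percent(margin_scores(mean_act, "C"), r=25)
```

In this toy setup neuron 2 fires about equally for every culture, so its margin is near zero and it is not selected for any of them. That is the intuition behind contrasting a target culture against the rest rather than ranking neurons by raw activation.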

Paper Structure

This paper contains 56 sections, 13 equations, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: An ablation example of Qwen2.5-VL-7B on the India-Marathi VQA subset. Given an image of Tilgul, an Indian sweet made from sesame seeds and jaggery, the full model selects the option that matches the ground truth; the RND mask does not affect the model's decision, while the LAPE and CAS masks redirect it to different answers. The methods mentioned are explained in § \ref{subsec:identification}.
  • Figure 2: Pipeline for identifying and validating culture-sensitive neurons: (1) record neuron activations on culture-specific VQAs, (2) identify influential neurons using several methods, and (3) evaluate their importance by ablating the top-$r\%$ neurons and measuring the effect on accuracy and answer divergence.
  • Figure 3: Per-culture accuracy of the unablated full models on CVQA. Distribution of per-culture accuracies for the three models on the identification split (marked with dots) and the evaluation split (marked in solid color). The full table of per-culture results appears in Appendix \ref{appendix:full_unmasked}.
  • Figure 4: Accuracy change $\Delta$ (top) and flip rate heatmaps (bottom) on CVQA for different identification methods (Qwen2.5-VL-7B; showing first ten cultures). On the y-axis we have the source culture for which neurons are identified and ablated and on the x-axis the culture used for evaluation. We report percentage changes relative to the unablated full model. Diagonal cells show self-deactivation results. Results for all culture pairs are in Appendix \ref{subsec:full_matrices_qwen}.
  • Figure 5: Layer-wise counts of identified neurons by different methods (Qwen2.5-VL-7B; log-scaled color).
  • ...and 8 more figures
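
The two evaluation quantities named in the captions of Figures 2 and 4, accuracy change and flip rate, can be sketched as follows. This is a minimal illustration under stated assumptions: the accuracy change is computed as a relative percentage against the unablated model, and the flip rate as the fraction of questions whose predicted option differs after ablation. The captions do not spell out the exact formulas, and the function names and toy predictions are hypothetical.

```python
from typing import Sequence


def accuracy(preds: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold answers."""
    if len(preds) != len(gold) or not gold:
        raise ValueError("preds and gold must be non-empty and equally long")
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


def accuracy_change_pct(full_preds: Sequence[str],
                        ablated_preds: Sequence[str],
                        gold: Sequence[str]) -> float:
    """Assumed definition: relative accuracy change in percent,
    measured against the unablated (full) model."""
    base = accuracy(full_preds, gold)
    if base == 0:
        raise ValueError("baseline accuracy is zero; relative change undefined")
    return 100.0 * (accuracy(ablated_preds, gold) - base) / base


def flip_rate(full_preds: Sequence[str], ablated_preds: Sequence[str]) -> float:
    """Assumed definition: fraction of questions whose predicted option
    changes after ablation, regardless of correctness."""
    if len(full_preds) != len(ablated_preds) or not full_preds:
        raise ValueError("prediction lists must be non-empty and equally long")
    return sum(a != b for a, b in zip(full_preds, ablated_preds)) / len(full_preds)


# Toy multiple-choice predictions for four questions (made-up data).
gold = ["A", "B", "C", "D"]
full = ["A", "B", "C", "A"]      # 3 of 4 correct
ablated = ["A", "C", "C", "B"]   # 2 of 4 correct, 2 answers changed

delta = accuracy_change_pct(full, ablated, gold)
flips = flip_rate(full, ablated)
```

The flip rate complements the accuracy change because an ablation can alter an answer without changing whether it is correct. In the toy data the last question flips from one wrong option to another, which the accuracy metric alone would not register.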