Finding Culture-Sensitive Neurons in Vision-Language Models
Xiutian Zhao, Rochelle Choenni, Rohit Saxena, Ivan Titov
TL;DR
This work reveals that vision-language models harbor culture-sensitive neurons whose activations are preferentially tied to specific cultures. It introduces Contrastive Activation Selection (CAS), a margin-based method that more precisely identifies these neurons than probability- or entropy-based approaches, and demonstrates their causal role through targeted ablations in three VLMs across 25 cultures on CVQA. Layer-wise analysis shows these neurons concentrate in the decoder, particularly in early and mid-layers, suggesting localized cultural knowledge representation. The findings offer a path toward targeted interventions to improve cultural alignment and fairness in multimodal models without full-scale retraining, and motivate activation-steering strategies for bias mitigation. Overall, the study advances our understanding of how multimodal models encode culturally grounded knowledge and where to intervene to refine their cross-cultural behavior.
Abstract
Despite their impressive performance, vision-language models (VLMs) still struggle on culturally situated inputs. To understand how VLMs process culturally grounded information, we study the presence of culture-sensitive neurons, i.e., neurons whose activations show preferential sensitivity to inputs associated with particular cultural contexts. We examine whether such neurons are important for culturally diverse visual question answering and where they are located. Using the CVQA benchmark, we identify culture-selective neurons and perform causal tests by deactivating the neurons flagged by different identification methods. Experiments on three VLMs across 25 cultural groups demonstrate the existence of neurons whose ablation disproportionately harms performance on questions about the corresponding cultures while having minimal effects on others. Moreover, we propose a new margin-based selector, Contrastive Activation Selection (CAS), and show that it outperforms existing probability- and entropy-based methods in identifying culture-sensitive neurons. Finally, our layer-wise analyses reveal that such neurons tend to cluster in certain decoder layers. Overall, our findings shed new light on the internal organization of multimodal representations.
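To make the identify-then-ablate procedure concrete, the following is a minimal toy sketch of a generic margin-based selector followed by a zero-ablation. It is not the paper's definition of CAS, which the abstract does not specify. The margin used here (a neuron's mean activation on one culture minus its highest mean activation on any other culture), the function names, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def mean_activation_by_culture(activations, cultures):
    """Average each neuron's activation over the examples of each culture.

    activations: list of rows, one row per example, one value per neuron.
    cultures: list of culture labels aligned with the rows.
    Returns {culture: [mean activation per neuron]}.
    """
    grouped = {}
    for row, culture in zip(activations, cultures):
        grouped.setdefault(culture, []).append(row)
    return {c: [mean(col) for col in zip(*rows)] for c, rows in grouped.items()}


def select_by_margin(means, top_k):
    """Assumed margin rule (illustrative, not the paper's CAS):
    margin = own-culture mean minus the best mean among all other cultures.
    Keeps up to top_k neurons per culture whose margin is positive.
    Requires at least two cultures.
    """
    n_neurons = len(next(iter(means.values())))
    selected = {}
    for culture, own in means.items():
        margins = []
        for j in range(n_neurons):
            best_other = max(m[j] for c, m in means.items() if c != culture)
            margins.append((own[j] - best_other, j))
        margins.sort(reverse=True)
        selected[culture] = [j for margin, j in margins[:top_k] if margin > 0]
    return selected


def ablate(row, neuron_ids):
    """Deactivate the chosen neurons by setting their activations to zero."""
    ids = set(neuron_ids)
    return [0.0 if j in ids else value for j, value in enumerate(row)]


# Hypothetical toy data: 4 examples, 3 neurons, 2 cultures.
activations = [
    [0.9, 0.1, 0.5],  # culture A
    [0.8, 0.2, 0.5],  # culture A
    [0.1, 0.7, 0.5],  # culture B
    [0.2, 0.9, 0.5],  # culture B
]
cultures = ["A", "A", "B", "B"]

means = mean_activation_by_culture(activations, cultures)
selected = select_by_margin(means, top_k=1)
ablated_example = ablate(activations[0], selected["A"])
```

In a real VLM the ablation would be applied inside the forward pass (for example, by zeroing the selected hidden units in a given layer), and the causal test would compare accuracy on questions about the target culture against accuracy on the other cultures before and after ablation.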
