Identifying and interpreting non-aligned human conceptual representations using language modeling

Wanqian Bao; Uri Hasson

TL;DR

A supervised representational-alignment method is introduced that determines whether two groups of individuals share the same basis of a certain category and in what respects they differ. Applied to blind and sighted individuals, it shows how blindness impacts conceptual representation of everyday verbs.

Abstract

The question of whether people's experience in the world shapes conceptual representation and lexical semantics is longstanding. Word-association, feature-listing, and similarity-rating tasks aim to address this question but require a subjective interpretation of the latent dimensions identified. In this study, we introduce a supervised representational-alignment method that (i) determines whether two groups of individuals share the same basis of a certain category, and (ii) explains in what respects they differ. In applying this method, we show that congenital blindness induces conceptual reorganization in both amodal and sensory-related verbal domains, and we identify the associated semantic shifts. We first apply supervised feature-pruning to a language model (GloVe) to optimize prediction accuracy of human similarity judgments from word embeddings. Pruning identifies one subset of retained GloVe features that optimizes prediction of judgments made by sighted individuals and another subset that optimizes prediction of judgments made by blind individuals. A linear probing analysis then interprets the latent semantics of these feature subsets by learning a mapping from the retained GloVe features to 65 interpretable semantic dimensions. We applied this approach to seven semantic domains, including verbs related to motion, sight, touch, and amodal verbs related to knowledge acquisition. We find that blind individuals more strongly associate social and cognitive meanings with verbs related to motion or those communicating non-speech vocal utterances (e.g., whimper, moan). Conversely, for amodal verbs, their representations carry much sparser information. Finally, for some verbs, representations of blind and sighted individuals are highly similar. The study presents a formal approach for studying interindividual differences in word meaning, and the first demonstration of how blindness impacts conceptual representation of everyday verbs.
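
The pruning step described in the abstract can be illustrated with a minimal sketch. It assumes a greedy backward-elimination procedure scored by the Pearson correlation between cosine similarities and group ratings; neither choice is confirmed by the abstract, and the paper's own pruning algorithm may differ. The names `prune` and `fit`, the toy embeddings, and the two simulated groups are all hypothetical.

```python
import math
import random
from itertools import combinations


def cosine(u, v, dims):
    """Cosine similarity of u and v restricted to the feature indices in dims."""
    dot = sum(u[d] * v[d] for d in dims)
    nu = math.sqrt(sum(u[d] ** 2 for d in dims))
    nv = math.sqrt(sum(v[d] ** 2 for d in dims))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def fit(emb, pairs, ratings, dims):
    """How well similarities computed on `dims` track the group's ratings."""
    preds = [cosine(emb[a], emb[b], dims) for a, b in pairs]
    return pearson(preds, ratings)


def prune(emb, pairs, ratings):
    """Assumed procedure: drop a feature whenever removing it improves the fit."""
    n_dims = len(next(iter(emb.values())))
    retained = list(range(n_dims))
    best = fit(emb, pairs, ratings, retained)
    improved = True
    while improved and len(retained) > 1:
        improved = False
        for d in list(retained):
            if len(retained) == 1:
                break
            trial = [k for k in retained if k != d]
            score = fit(emb, pairs, ratings, trial)
            if score > best:
                retained, best, improved = trial, score, True
    return retained, best


# Toy data (hypothetical): 8 words with 6-dimensional embeddings.
random.seed(0)
words = ["w%d" % i for i in range(8)]
emb = {w: [random.gauss(0, 1) for _ in range(6)] for w in words}
pairs = list(combinations(words, 2))

# Simulated groups: group A's judgments depend on features 0-2, group B's on 3-5.
ratings_a = [cosine(emb[a], emb[b], [0, 1, 2]) for a, b in pairs]
ratings_b = [cosine(emb[a], emb[b], [3, 4, 5]) for a, b in pairs]

full_a = fit(emb, pairs, ratings_a, range(6))
kept_a, score_a = prune(emb, pairs, ratings_a)
kept_b, score_b = prune(emb, pairs, ratings_b)
```

Running `prune` separately on each group's ratings yields one retained feature subset per group, which is the kind of pair the abstract goes on to compare and probe. In this sketch the fit can only stay equal or improve relative to the full embedding, because a feature is removed only when its removal raises the score.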

Paper Structure (18 sections, 3 equations, 4 figures, 3 tables, 1 algorithm)

Introduction
Conceptual representation in congenitally blind
Current approach: supervised learning of representation
Method
Datasets and word embeddings
Supervised pruning
Pruning algorithm
Overlap between feature sets retained by pruning
Supervised probing
Results
Preliminary evaluation: out of sample generalization of pruning solution
Correspondence between feature-sets retained by pruning
Probing: Information in retained feature sets
Discussion
Appendix
...and 3 more sections

Figures (4)

Figure 1: Analysis workflow consisting of supervised pruning and supervised probing. Solid lines indicate input datasets and dashed lines indicate products of analyses.
Figure 1: Prediction accuracy for each of the 65 features in Binder et al., from each of the 14 retained feature sets obtained via pruning.
Figure 2: Panel A: Dice coefficients for different combinations of pruned sets. The size of the larger feature set was always matched to the size of the smaller one prior to computing the coefficient. p_a, p_s, p_t indicate Perception Action, Sight, and Touch verbs. e_l, e_sa, e_si indicate Emission of Light, Animate Sounds, and Inanimate Sounds. 'm' indicates Motion verbs. Panel B: The number of GloVe features (300 in total) that appeared in zero or more retained sets. No feature appeared in more than nine of the 14 sets.
Figure 3: Panel A: Correlations between predicted and ground truth ratings when retained feature sets were used as regressors for predicting Binder's words. p_a, p_s, p_t indicate Perception Action, Sight, and Touch verbs. e_l, e_sa, e_si indicate Emission of Light, Animate Sounds, and Inanimate Sounds. 'm' indicates Motion verbs. The top row (300d) shows prediction accuracy when all 300 GloVe dimensions are used in the PLSR model. Panel B: Clustering of experimental conditions by prediction accuracy for the 65 Binder features. Only in the case of Emission-InanimateSounds are congenitally blind and sighted positioned in adjacent terminal leaves.
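
The size-matched Dice coefficient mentioned in the Figure 2 caption can be sketched as follows. The caption does not say how the larger set was reduced, so this sketch assumes each retained set is given as a ranked list and that the larger list is truncated to the length of the smaller one; the function names are hypothetical.

```python
def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two collections of feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


def size_matched_dice(ranked_a, ranked_b):
    """Truncate the longer ranked list to the shorter one's length (assumed rule), then compute Dice."""
    k = min(len(ranked_a), len(ranked_b))
    return dice(ranked_a[:k], ranked_b[:k])


# Hypothetical retained feature indices for two conditions, best-ranked first.
overlap = size_matched_dice([5, 1, 9, 4], [1, 5])
```

Matching sizes before computing the coefficient keeps a large set from looking dissimilar to a small one merely because it contains more features.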
