Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses

Richard Antonello, Javier Turek, Vy Vo, Alexander Huth

TL;DR

The paper introduces a representation-embedding framework that uses an encoder-decoder transfer setup to map 100 language representations into a low-dimensional embedding space. This embedding reveals a coherent structure, largely captured by the first two dimensions, showing a progression from word embeddings to deeper language-model layers and semantic tagging tasks. The authors demonstrate that this embedding aligns with human brain representations by predicting fMRI encoding-model performance and mapping a brain-language hierarchy along the principal embedding dimension. The approach offers a general template for quantifying relationships among linguistic representations and their neural correlates, with potential extensions to richer task sets and nonlinear transfer models.

Abstract

How related are the representations learned by neural language models, translation models, and language tagging tasks? We answer this question by adapting an encoder-decoder transfer learning method from computer vision to investigate the structure among 100 different feature spaces extracted from hidden representations of various networks trained on language tasks. This method reveals a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings. We call this low-dimensional structure a language representation embedding because it encodes the relationships between representations needed to process language for a variety of NLP tasks. We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI. Additionally, we find that the principal dimension of this structure can be used to create a metric which highlights the brain's natural language processing hierarchy. This suggests that the embedding captures some part of the brain's natural language representation structure.
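
The transfer setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a closed-form reduced-rank regression stands in for the trained bottlenecked linear encoder-decoder network, the data are synthetic, errors are measured in-sample, and the helper names (`fit_bottleneck_encoder`, `fit_decoder`, `mse`) are hypothetical. It only shows the mechanics: build a low-dimensional latent space per representation from a shared input space, then compare how well decoders from different latent spaces reach the same target.

```python
import numpy as np

def fit_bottleneck_encoder(U, T, k, ridge=1e-3):
    """Closed-form stand-in for a bottlenecked linear encoder-decoder U -> T.

    Fits a ridge map from U to T, then keeps the k output directions that
    carry the most predicted variance. Returns the encoder (d_u x k); the
    decoding half is discarded, as in the caption of Figure 1.
    """
    d = U.shape[1]
    B = np.linalg.solve(U.T @ U + ridge * np.eye(d), U.T @ T)
    _, _, Vt = np.linalg.svd(U @ B, full_matrices=False)
    return B @ Vt[:k].T

def fit_decoder(L, T, ridge=1e-3):
    """Ridge decoder from a latent space L to a target representation T."""
    k = L.shape[1]
    return np.linalg.solve(L.T @ L + ridge * np.eye(k), L.T @ T)

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Synthetic stand-ins (assumptions): U plays the role of the universal input
# space, T1 and T2 are two toy "representations" of the same stimuli.
rng = np.random.default_rng(0)
n, d_u, k = 500, 20, 5
U = rng.standard_normal((n, d_u))
T1 = U @ rng.standard_normal((d_u, 8)) + 0.1 * rng.standard_normal((n, 8))
T2 = U @ rng.standard_normal((d_u, 6)) + 0.1 * rng.standard_normal((n, 6))

# One encoder, and hence one latent space, per representation.
E1 = fit_bottleneck_encoder(U, T1, k)
E2 = fit_bottleneck_encoder(U, T2, k)
L1, L2 = U @ E1, U @ E2

# Decoders that map to the same target (T1) from different latent spaces.
D_11 = fit_decoder(L1, T1)
D_21 = fit_decoder(L2, T1)
err_self = mse(L1 @ D_11, T1)
err_transfer = mse(L2 @ D_21, T1)
print(f"self: {err_self:.3f}  transfer from T2's latent space: {err_transfer:.3f}")
```

In this toy setting the two targets are unrelated, so the latent space built for `T2` transfers poorly to `T1`. Related representations would be expected to show a smaller gap.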

Paper Structure

This paper contains 26 sections, 4 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: The encoder-decoder strategy used in our method, adapted from Zamir et al. (2018). $\mathcal{S}$ denotes the natural language stimuli. We chose to represent stimuli in the universal input feature space $U(\mathcal{S})$ as GloVe word embeddings. Encoders $E_{t_i}$ were trained using a bottlenecked linear encoder-decoder network, which outputs to $t_i(\mathcal{S})$ (blue arrows). The decoding half of this network was then discarded, and the encoding half used to generate a latent space $\mathcal{L}_{t_i}$ for each representation $t_i$. Then, a decoder $D_{t_i \rightarrow t_j}$ is trained from each latent space $\mathcal{L}_{t_i}$ to each representation $t_j$ (orange arrows). The performances of decoders that map to the same final representation are then compared to one another.
  • Figure 2: Language Representation Embeddings with Low Dimensionality: (Left): The representation embedding matrix $\mathbf{R}$ shows how well a given linguistic feature space (encoder, columns) transfers to another feature space (decoder, rows). For better visualization, rows and columns corresponding to different layers from the same network have been scaled down in this plot. A full-scale matrix is in the supplementary material. (Right): Applying multi-dimensional scaling to the representation embedding matrix reveals low-dimensional structure in the linguistic feature spaces. It is dominated by a left-to-right progression from the input word embedding, to syntactic and semantic tagging tasks near the middle layers of language models, to the next word embedding. Multidimensional scaling was weighted such that each full model had equal weight, ensuring that language models were not more influential on account of having more layers. The dominant main diagonal was set to 0.1 to preserve the effects of off-diagonal values. The scree plot in the lower left shows that these first two dimensions explain substantially more variance (22% and 10%) than other dimensions, demonstrating that the structure in this space is low-dimensional.
  • Figure 3: Embedding Brain Voxels in the first MDS dimension: Projection of the encoding performance vectors for each voxel in one subject (lower center flatmap and all 3D views) and averaged over all subjects within anatomical regions (upper center flatmap) over the 100 representations onto the first MDS dimension of the representation embeddings, which explains about 20% of the variance in the representation embeddings. Voxels with high values in this embedding (red) are better explained by representations that are more positive on the main MDS dimension (e.g. later language model layers), and voxels with low values (blue) are better explained by representations that are more negative (e.g. word embeddings). This dimension is notable as it is the main dimension along which language representations evolve from "earlier" representations such as word embeddings, to "later" representations such as intermediate layers in deep language models. Anatomical ROIs were defined automatically in each subject using Freesurfer with the Destrieux 2009 atlas (Destrieux et al., 2010). Similar maps for the other subjects and a plot showing the numerical projection of regions in the MDS space are shown in the appendix.
  • Figure 4: Representation Embeddings Reflect Brain Responses: (Left): Discriminability score matrix $\mathbf{M}$, the average of the per-subject matrices $\mathbf{M}_x$, each of which is computed as described in Section 4.2.4. For each representation, we fit and tested encoding models that predict the fMRI response in each cortical voxel, yielding a pattern of prediction performance across the brain. For each pair, we then tested whether the brain patterns could be correctly matched to the representations on the basis of the representation embeddings shown in Figure 2. Highly discriminable pairs appear red, non-discriminable pairs white, and pairs that are less discriminable than expected by chance appear blue. Most pairs of representations yield brain patterns that are easy to distinguish using the embeddings, suggesting that these embeddings reflect the structure of representations in the human brain. However, some pairs are similar enough in both embedding and brain that discrimination between them falls to chance level, such as the word embeddings (green labels) and nearby layers of language models. (Upper Right): The percentage of pairwise matches for each representation where the match is correct more often than not (on 3 or more subjects). Almost all representations can be correctly matched with their corresponding performance vector $\mathbf{p}$, with the interpretable representations being the most difficult to distinguish. (Lower Right): Mean voxelwise correlation for encoding models built using each representation of the natural language story stimuli. As seen in other literature (Schrimpf et al., 2020), the intermediate layers of Transformer-based language models work best as encoding model representations.
  • Figure 5: Generation and Use of the Pairwise Tournament Matrix $\mathbf{W}$: The decoders from a given latent space are compared in the generation of the pairwise tournament matrix $\mathbf{W}_{t_2}$ for task $t_2$. The proportion of the data for which each decoder outperforms the other is measured and compared. Diagonals of this matrix are set to 0. The eigenvector of this matrix with the highest eigenvalue is then assigned to be the row of our representation embedding matrix $\mathbf{R}$ corresponding to that matrix's encoded task. Eigenvectors and eigenvalues are computed using the differential quotient-difference algorithm. Complex components of the eigenvalue are ignored.
  • ...and 8 more figures
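
The tournament step in the caption of Figure 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the per-sample decoder losses are synthetic, the function names (`tournament_matrix`, `principal_eigenvector`) are hypothetical, `numpy.linalg.eig` is swapped in for the differential quotient-difference algorithm named in the caption, and any further normalization of $\mathbf{W}$ that the paper may apply is omitted because the caption does not specify it.

```python
import numpy as np

def tournament_matrix(losses):
    """Pairwise tournament matrix from per-sample decoder losses.

    losses has shape (n_sources, n_samples): row a holds the loss of the
    decoder from source latent space a on each sample of one target task.
    W[a, b] is the fraction of samples on which decoder a beats decoder b.
    """
    wins = losses[:, None, :] < losses[None, :, :]
    W = wins.mean(axis=2)
    np.fill_diagonal(W, 0.0)  # diagonals are set to 0, as in the caption
    return W

def principal_eigenvector(W):
    """Eigenvector of the largest eigenvalue, keeping only real parts."""
    vals, vecs = np.linalg.eig(W)
    v = vecs[:, np.argmax(vals.real)].real
    v = v * np.sign(v.sum())      # resolve the arbitrary sign
    return v / np.abs(v).sum()    # scale so the scores sum to 1

# Synthetic losses (assumption): source 0 is best and source 2 is worst.
rng = np.random.default_rng(0)
offsets = np.array([0.0, 0.5, 1.0])[:, None]
losses = rng.uniform(0.0, 1.0, size=(3, 1000)) + offsets

W = tournament_matrix(losses)
scores = principal_eigenvector(W)
print(np.round(W, 3))
print(np.round(scores, 3))
```

Under the caption's description, a vector like `scores` would fill the row of $\mathbf{R}$ for the target task, with larger entries marking source representations whose decoders win more often.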