What Makes Two Language Models Think Alike?

Jeanne Salle, Louis Jalouzot, Nur Lan, Emmanuel Chemla, Yair Lakretz

TL;DR

This work proposes a new approach, based on metric-learning encoding models (MLEMs), that provides a feature-based comparison of how any two layers of any two models represent linguistic information, and applies it to BERT, GPT-2 and Mamba.

Abstract

Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step toward answering this question. The approach provides a feature-based comparison of how any two layers of any two models represent linguistic information. We apply the method to BERT, GPT-2 and Mamba. Unlike previous methods, MLEMs offer a transparent comparison by identifying the specific linguistic features responsible for similarities and differences. More generally, the method takes formal, symbolic descriptions of a domain and uses them to compare neural representations. As such, the approach can straightforwardly be extended to other domains, such as speech and vision, and to other neural systems, including human brains.

Paper Structure

This paper contains 14 sections, 2 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: Marr's levels of analysis. While language models may share the same computational goal (top level, next-word prediction), their architectures could differ substantially (bottom level). They therefore may or may not develop the same representations and algorithms (middle level) to perform the task.
  • Figure 2: Model Similarity. (A) Feature-based similarity matrix of pairwise correlations between feature-importance values. (B) Feature-agnostic similarity matrix based on raw Euclidean distances between word embeddings. Multi-Dimensional Scaling representations of these distances are shown for both types of analysis (B stands for BERT, G for GPT-2, and M for Mamba).
  • Figure 3: Feature Importance Profiles. The relative importance of linguistic features varies across layers and models.
  • Figure 4: How models and layers represent linguistic features. MDS plots of the representations, and pairwise comparison of the Feature Importance profiles.
  • Figure S5: Pairwise Pearson correlations among all linguistic features in the probing dataset.
  • ...and 1 more figure
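
The feature-based similarity matrix described for Figure 2 (A) can be illustrated with a minimal sketch: each model layer is summarized by a vector of feature-importance values, and the similarity between two layers is the Pearson correlation of their vectors. The profile values, the layer labels, and the three-feature setup below are hypothetical placeholders chosen for illustration, not results from the paper, and the helper names `pearson` and `similarity_matrix` are assumptions rather than the authors' code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Pairwise correlations between feature-importance profiles.

    `profiles` maps a layer label to a list of importance values,
    one value per linguistic feature, in the same feature order.
    """
    names = list(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical profiles over three unnamed linguistic features.
# B, G, and M follow the caption's labels for BERT, GPT-2, and Mamba;
# the numbers are made up for illustration only.
profiles = {
    "B-layer1": [0.6, 0.3, 0.1],
    "G-layer1": [0.5, 0.4, 0.1],
    "M-layer1": [0.1, 0.2, 0.7],
}

names, sim = similarity_matrix(profiles)
for name, row in zip(names, sim):
    print(name, [round(value, 2) for value in row])
```

Because the comparison runs over named features rather than raw embedding coordinates, a high or low correlation can be traced back to the specific features whose importance values agree or diverge, which is the transparency the abstract emphasizes.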