Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification

Vivi Nastase; Paola Merlo

TL;DR

This work asks how linguistic information is encoded in transformer-based sentence embeddings. It applies targeted sparsification to a CNN-based encoder that reshapes each sentence embedding into a $32\times24$ array and compresses it into a latent vector of length $5$. By enforcing disjoint connections from CNN channels to latent units and tracing signals back to embedding regions, the authors localize chunk-structure information (noun, verb, and prepositional phrases) to specific small regions of the embedding. Performance on chunk-focused problems (Blackbird Language Matrices) drops only modestly under sparsification, and the analysis indicates that chunk information can be recovered from, and localized in, the bottom portions of the embedding. Using a two-level VAE architecture and targeted locality analyses, the findings advance the explainability of transformer sentence representations and suggest concrete directions for building interpretable neural models for structured linguistic tasks.
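The disjoint channel-to-latent wiring described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the channel count, the number of CNN output nodes per channel, the round-robin assignment of channels to latent units, and the names `disjoint_channel_mask` and `masked_latent` are all assumptions made for illustration. Only the $32\times24$ reshape and the latent size of $5$ come from the summary, and random values stand in for the real CNN output.

```python
import numpy as np

EMB_ROWS, EMB_COLS = 32, 24  # reshape size from the summary (32 * 24 = 768)
LATENT_DIM = 5               # latent size from the summary
N_CHANNELS = 40              # assumption: not stated in this summary
N_POSITIONS = 4              # assumption: CNN output nodes per channel


def disjoint_channel_mask(n_channels, n_positions, latent_dim):
    """Binary mask of shape (latent_dim, n_channels * n_positions).

    Every output node of channel c is wired to exactly one latent unit,
    chosen here as c % latent_dim (a hypothetical assignment rule).
    """
    mask = np.zeros((latent_dim, n_channels * n_positions))
    for c in range(n_channels):
        unit = c % latent_dim
        mask[unit, c * n_positions:(c + 1) * n_positions] = 1.0
    return mask


def masked_latent(cnn_out, weights, mask):
    """Map CNN output (n_channels, n_positions) to a latent vector,
    zeroing every weight that the mask does not allow."""
    return (weights * mask) @ cnn_out.reshape(-1)


rng = np.random.default_rng(0)
embedding = rng.normal(size=EMB_ROWS * EMB_COLS)       # stand-in sentence embedding
grid = embedding.reshape(EMB_ROWS, EMB_COLS)           # 2D array a CNN would scan
cnn_out = rng.normal(size=(N_CHANNELS, N_POSITIONS))   # stand-in for real CNN output
weights = rng.normal(size=(LATENT_DIM, N_CHANNELS * N_POSITIONS))
mask = disjoint_channel_mask(N_CHANNELS, N_POSITIONS, LATENT_DIM)
latent = masked_latent(cnn_out, weights, mask)
```

Because each channel feeds a single latent unit under this mask, a change in one channel's output can only move that unit, which is what makes it possible to trace a latent unit back to the channels, and from there to the embedding regions, that drive it.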

Abstract

Analyses of transformer-based models have shown that they encode a variety of linguistic information from their textual input. While these analyses have shed light on the relation between linguistic information on one side, and internal architecture and parameters on the other, a question remains unanswered: how is this linguistic information reflected in sentence embeddings? Using datasets consisting of sentences with known structure, we test to what degree information about chunks (in particular noun, verb or prepositional phrases), such as grammatical number or semantic role, can be localized in sentence embeddings. Our results show that such information is not distributed over the entire sentence embedding, but rather is encoded in specific regions. Understanding how the information from an input text is compressed into sentence embeddings helps understand current transformer models and helps build future explainable neural models.

Paper Structure

This paper contains 20 sections, 1 equation, 8 figures, and 1 table.

Introduction
Related work
Sentence embeddings
Probing embeddings and models for linguistic information
Sparsification
Approach overview
Data
A dataset of sentences
Multiple Choice Problems: Blackbird Language Matrices
Datasets statistics
Experiments
Sparsification
Sparsification results
Localizing linguistic information in sentence embeddings
Localization results
...and 5 more sections

Figures (8)

Figure 1: Structure of two BLM problems, in terms of chunks in sentences and sequence structure.
Figure 2: Details of the encoder architecture
Figure 3: Separating linguistic signals by masking the one-layer FFNN
Figure 4: t-SNE projection of the latent layer for encoder-decoder with full network connections.
Figure 6: Average cosine distance between value distributions in each CNN output node (i.e. each node corresponding to the application of the kernel from each channel on the sentence embeddings, according to the kernel size and stride) for sets of sentences with minimally different patterns: (left) patterns differ in only one grammatical number attribute for one chunk, (middle) patterns differ only in length, (right) patterns differ only in the number of the subject and verb. Each panel corresponds to one region of the sentence embedding the size of the kernel. The y-axis represents the channels of the CNN. The x-axis represents the latent units in different colors (the stronger the color, the higher the value, max = 1), and the pairs of compared patterns are represented as adjacent rectangles.
...and 3 more figures
