Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models using Minimal Pairs

Linyang He, Peili Chen, Ercong Nie, Yuanning Li, Jonathan R. Brennan

TL;DR

A novel “decoding probing” method uses a minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. It reveals that self-supervised language models capture abstract linguistic structures in intermediate layers that GloVe and RNN language models cannot learn.

Abstract

Inspired by cognitive neuroscience studies, we introduce a novel “decoding probing” method that uses a minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer. By treating the language model as the “brain” and its representations as “neural activations”, we decode grammaticality labels of minimal pairs from the intermediate layers' representations. This approach reveals: 1) Self-supervised language models capture abstract linguistic structures in intermediate layers that GloVe and RNN language models cannot learn. 2) Information about syntactic grammaticality is robustly captured within the first third of GPT-2's layers and is also distributed across later layers. As sentence complexity increases, more layers are required to capture grammaticality. 3) Morphological and semantics/syntax interface-related features are harder to capture than syntactic ones. 4) For Transformer-based models, both embeddings and attentions capture grammatical features but show distinct patterns. Different attention heads exhibit similar tendencies toward various linguistic phenomena, but their contributions vary.
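The layer-by-layer decoding idea described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the logistic-regression probe, the function names, and the synthetic “activations” are hypothetical stand-ins, not the authors' classifier or real model representations. The sketch also reports accuracy for simplicity, whereas the figure captions mention F1.

```python
import math
import random


def train_logistic_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of held-out sentences whose grammaticality label is decoded correctly."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((z > 0) == (y == 1))
    return correct / len(labels)


def synthetic_layer(rng, n_pairs, dim, signal):
    """Toy stand-in for one layer's sentence vectors (NOT real model output).

    Each minimal pair yields one grammatical (label 1) and one ungrammatical
    (label 0) vector; `signal` controls how separable the two are.
    """
    feats, labels = [], []
    for _ in range(n_pairs):
        for label, sign in ((1, 1.0), (0, -1.0)):
            vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            vec[0] += sign * signal
            feats.append(vec)
            labels.append(label)
    return feats, labels


def decode_by_layer(signals, n_train=100, n_test=100, dim=8, seed=0):
    """Train one probe per layer and return held-out accuracy for each layer."""
    rng = random.Random(seed)
    scores = []
    for signal in signals:
        train_x, train_y = synthetic_layer(rng, n_train, dim, signal)
        test_x, test_y = synthetic_layer(rng, n_test, dim, signal)
        weights, bias = train_logistic_probe(train_x, train_y)
        scores.append(probe_accuracy(weights, bias, test_x, test_y))
    return scores


# Hypothetical per-layer signal strengths, chosen only to make the toy data vary.
LAYER_SIGNALS = [0.0, 1.0, 2.0]
scores = decode_by_layer(LAYER_SIGNALS)
for layer, score in enumerate(scores):
    print(f"layer {layer}: probe accuracy = {score:.2f}")
```

In a real setting, `synthetic_layer` would be replaced by sentence representations extracted from each layer of the model under study, with train and test splits drawn from the minimal pairs of a single BLiMP task.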

Paper Structure

This paper contains 20 sections, 1 equation, 9 figures, 1 table.

Figures (9)

  • Figure 1: Decoding probing pipeline. Note: the current study does not include any actual experiments on the brain; the brain is shown only to illustrate the core idea, as the dashed lines indicate.
  • Figure 2: Decoding probing results of GPT-2 XL, ELMo and GloVe. The symbols '***', '**', and '*' denote t-test p-values less than 0.001, 0.01, and 0.05, respectively.
  • Figure 3: Decoding probing results of GPT-2 XL. Pink lines represent morphology, yellow lines highlight the semantics/syntax interface, and blue lines correspond to syntax.
  • Figure 4: Sentence complexity and feature capture depth show a strong linear correlation in GPT-2 XL.
  • Figure 5: Feature capture depth for 12 linguistic phenomena, derived from 41 tasks after filtering out those where GloVe's F1 score exceeds 0.9.
  • ...and 4 more figures