CALaMo: a Constructionist Assessment of Language Models

Ludovica Pannitto, Aurélie Herbelot

TL;DR

A usage-based model that is in line with the underlying stochastic philosophy of neural architectures and that also allows the linguist to keep meaning as a determinant factor in the analysis.

Abstract

This paper presents a novel framework for evaluating Neural Language Models' linguistic abilities using a constructionist approach. Not only is the usage-based model in line with the underlying stochastic philosophy of neural architectures, but it also allows the linguist to keep meaning as a determinant factor in the analysis. We outline the framework and present two possible scenarios for its application.

Paper Structure (17 sections, 5 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Let's assume that $\Lambda$ contains both the constructions DET NOUN and the dog, with the latter being a lexicalized instance of the former. At different steps during acquisition, the two constructions can assume different meanings and therefore be associated with different distributional vectors. A distributional vector in fact condenses information about co-occurrences between linguistic items in a given piece of text. In the figure, we see that a cluster of vectors gathers around DET NOUN in the constructicon built from the input data (leftmost panel). This means that a variety of lexicalized instances exist for the construction DET NOUN. During learning, the constructicons built from generated output show different distributions for the construction DET NOUN. In the central panel, the cosine distance between DET NOUN and the dog is 0, meaning that their distributional contexts (i.e., their co-occurrences) perfectly overlap. In the rightmost panel, instead, the distance between the two vectors has increased because another lexicalized instance (i.e., the cat) is being produced. In this scenario, the contexts where DET NOUN appears do not perfectly overlap with those where the dog appears.
  • Figure 2: The dependency representation of the sentence Mary had a little lamb, annotated with morpho-syntactic and syntactic information. In this structure, we can identify the following catenae: a, b, c, d, e, ab, abce, abde, abcde, abe, bce, bde, be, ce, de, cde. Other possibilities would have been strings (e.g., a, ab, abc, ... b, bc, ...e) or constituents (i.e., a, abcde, c, d, cde).
  • Figure 3: Distribution of average cosine similarities for the three groups of $\kappa_j$, showing low, intermediate and high average shifts respectively.
  • Figure 4: Difference in input frequency between the three groups of constructions: core as the ones shared by all speakers, periphery as the ones shared by half of the speakers or less, and other as the remaining ones.
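
The captions of Figures 1 and 2 rely on two operations that can be illustrated with a minimal sketch: the cosine distance between distributional vectors, and the enumeration of catenae in a dependency tree. Everything specific in the code is an assumption made for illustration, not taken from the paper: the co-occurrence counts are invented toy values, the DET NOUN vector is approximated as the sum of the vectors of its lexicalized instances, and the head assignments for Mary had a little lamb (a=Mary, b=had, c=a, d=little, e=lamb) follow a standard dependency analysis, since the figure itself is not reproduced here. A catena is treated as any set of words that forms a connected piece of the tree.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Toy co-occurrence counts over three context features (invented values).
the_dog = [2, 0, 1]
the_cat = [0, 3, 1]

# Central panel of Figure 1: "the dog" is the only instance of DET NOUN,
# so the two constructions share exactly the same contexts.
det_noun_central = list(the_dog)

# Rightmost panel: "the cat" is also produced, so DET NOUN now covers
# contexts that "the dog" does not (assumption: vectors are summed).
det_noun_right = [x + y for x, y in zip(the_dog, the_cat)]


def catenae(heads):
    """Enumerate catenae of a dependency tree.

    `heads` maps each word label to the label of its head (None for the
    root). A subset of words is connected in the tree exactly when one,
    and only one, of its members has its head outside the subset.
    """
    nodes = sorted(heads)
    found = []
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            chosen = set(subset)
            tops = [n for n in subset if heads[n] not in chosen]
            if len(tops) == 1:
                found.append("".join(subset))
    return found


# Assumed analysis: "had" is the root, "Mary" and "lamb" depend on "had",
# "a" and "little" depend on "lamb".
mary_heads = {"a": "b", "b": None, "c": "e", "d": "e", "e": "b"}

if __name__ == "__main__":
    print(cosine_distance(det_noun_central, the_dog))
    print(cosine_distance(det_noun_right, the_dog))
    print(catenae(mary_heads))
```

Under these assumptions the first distance is zero up to floating-point error and the second is strictly positive, which mirrors the shift between the central and rightmost panels of Figure 1. For the assumed tree, the enumeration returns every catena listed in the Figure 2 caption plus bcde (had a little lamb), which is also connected in that tree.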