Learning inflection classes using Adaptive Resonance Theory

Peter Dekker, Heikki Rasilo, Bart de Boer

TL;DR

The paper investigates how inflection classes can be learned as cognitive abstractions by unsupervised clustering, using Adaptive Resonance Theory (ART1) with a tunable vigilance parameter to control generalisation. It applies a trigram-based binary encoding of selected paradigm cells to Latin, Portuguese, and Estonian data, evaluates the clustering against linguistically annotated classes with the Adjusted Rand Index (ARI) and Adjusted Mutual Information (AMI), and compares it to a k-means baseline. Results show near-perfect clustering for Latin in a narrow vigilance window, moderate success for Portuguese, and intermediate performance for Estonian, with all languages demonstrating some generalisation to unseen data. The authors discuss implications for modelling language change via diachronic agent-based simulations and outline future work to improve data representations and automatic vigilance tuning.
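To make the trigram-based binary encoding more concrete, the following is a minimal sketch of one way such an encoding could be built. It is not the authors' implementation. The boundary padding symbol `#`, the choice to key each feature by (paradigm cell, trigram), and the helper names `trigrams` and `encode_lexemes` are all assumptions made for illustration, and the toy paradigms are hand-picked Latin forms rather than the paper's data.

```python
def trigrams(form, pad="#"):
    """Return the set of character trigrams of a form, padded at both edges.

    The padding symbol is an assumption of this sketch.
    """
    padded = pad + form + pad
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def encode_lexemes(paradigms):
    """Encode each lexeme as a binary vector over (cell, trigram) features.

    paradigms maps lexeme -> {paradigm cell -> inflected form}.
    Keying features by cell is an assumption; pooling trigrams across
    cells would be another possible design.
    """
    features = sorted({
        (cell, tri)
        for cells in paradigms.values()
        for cell, form in cells.items()
        for tri in trigrams(form)
    })
    index = {feat: i for i, feat in enumerate(features)}
    vectors = {}
    for lexeme, cells in paradigms.items():
        vec = [0] * len(features)
        for cell, form in cells.items():
            for tri in trigrams(form):
                vec[index[(cell, tri)]] = 1  # feature present in this lexeme
        vectors[lexeme] = vec
    return features, vectors


# Toy illustration with two Latin verbs (not the paper's dataset).
toy = {
    "amare": {"prs.1sg": "amo", "inf": "amare"},
    "monere": {"prs.1sg": "moneo", "inf": "monere"},
}
features, vectors = encode_lexemes(toy)
```

In this toy case the feature `("inf", "are")` is set only for `amare` and `("inf", "ere")` only for `monere`, which is the kind of ending-specific signal a clustering model can pick up on.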

Abstract

The concept of inflection classes is an abstraction used by linguists, and provides a means to describe patterns in languages that give an analogical base for deducing previously unencountered forms. This ability is an important part of morphological acquisition and processing. We study the learnability of a system of verbal inflection classes by the individual language user by performing unsupervised clustering of lexemes into inflection classes. As a cognitively plausible and interpretable computational model, we use Adaptive Resonance Theory, a neural network with a parameter that determines the degree of generalisation (vigilance). The model is applied to Latin, Portuguese and Estonian. The similarity of clustering to attested inflection classes varies depending on the complexity of the inflectional system. We find the best performance in a narrow region of the generalisation parameter. The learned features extracted from the model show similarity with linguistic descriptions of the inflection classes. The proposed model could be used to study change in inflection classes in the future, by including it in an agent-based model.
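The similarity between a learned clustering and the attested inflection classes is reported with ARI and AMI. As a rough illustration of what ARI measures, here is a small standard-library sketch of the usual pair-counting formula. The function name `adjusted_rand_index` and the toy labels are assumptions of this example, and the authors' evaluation code may well rely on an existing library instead.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(labels_true)
    total = comb(n, 2)  # number of item pairs
    if total == 0:
        return 1.0
    cells = Counter(zip(labels_true, labels_pred))  # contingency table
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        # Degenerate case: both labelings are trivial and identical in shape.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


# Cluster ids are arbitrary, so a relabelled perfect clustering scores 1.
attested = ["I", "I", "II", "II"]
perfect = [7, 7, 3, 3]
crossed = [0, 1, 0, 1]
score_perfect = adjusted_rand_index(attested, perfect)
score_crossed = adjusted_rand_index(attested, crossed)
```

A score of 1 means the clusters match the attested classes exactly, values near 0 mean agreement no better than chance, and negative values (as for `crossed`) mean agreement worse than chance.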

Paper Structure

This paper contains 19 sections, 6 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Illustration of the training phases of the ART1 network. Step 1) Input sample $\boldsymbol{x}$ is propagated through the bottom-up weights, and the output node with the highest activation is selected as the hypothesised category for this sample. Step 2) The top-down weights for this category are used to access the category template $\boldsymbol{y}$. A logical AND operation yields the feature vector $\boldsymbol{z}$ shared between the template and the input. If the match $M$ between $\boldsymbol{z}$ and $\boldsymbol{x}$ is greater than or equal to the vigilance value $\rho$, resonance occurs: the category template is updated to match $\boldsymbol{z}$, and the bottom-up weights are updated accordingly. If the vigilance value is not reached, the search restarts from the category with the next-highest activation.
  • Figure 2: ART1 results for Latin at different vigilance values (95% confidence intervals over 10 random permutations of the data). Left figure: solid line = AMI (Adjusted Mutual Information), dashed line = ARI (Adjusted Rand Index).
  • Figure 3: Assigned lexemes per cluster for a single ART1 run on Latin (vigilance 0.06). Bar = cluster, colour = attested inflection class of the assigned lexemes.
  • Figure 4: ART1 results for Portuguese at different vigilance values (95% confidence intervals over 10 random permutations of the data). Left figure: solid line = AMI (Adjusted Mutual Information), dashed line = ARI (Adjusted Rand Index).
  • Figure 5: Assigned lexemes per cluster for a single ART1 run on Portuguese (vigilance 0.02). Bar = cluster, colour = attested inflection class of the assigned lexemes. Only inflection classes with more than 10 lexemes are shown.
  • ...and 3 more figures
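
The training steps described in the Figure 1 caption can be sketched as follows. This is a simplified, single-pass, fast-learning variant written for illustration, not the authors' code. The activation formula $|\boldsymbol{x} \wedge \boldsymbol{y}| / (\beta + |\boldsymbol{y}|)$, the match $M = |\boldsymbol{z}| / |\boldsymbol{x}|$, the parameter `beta`, and the name `art1_fit` follow a common textbook formulation of ART1 and are assumptions here. The bottom-up weights are not stored separately but derived from the templates on the fly.

```python
def art1_fit(samples, vigilance, beta=1.0):
    """Cluster binary vectors with a simplified fast-learning ART1.

    Returns (templates, assignments). One pass over the data only;
    beta and the activation formula are assumptions of this sketch.
    """
    templates = []    # one binary template y per category
    assignments = []  # category index chosen for each sample
    for x in samples:
        norm_x = sum(x)
        # Step 1: bottom-up activation of every existing category.
        scored = []
        for j, y in enumerate(templates):
            overlap = sum(a & b for a, b in zip(x, y))
            scored.append((overlap / (beta + sum(y)), j))
        # Try categories from highest to lowest activation.
        scored.sort(key=lambda t: (-t[0], t[1]))
        chosen = None
        for _, j in scored:
            # Step 2: compare the input with the template (logical AND).
            z = [a & b for a, b in zip(x, templates[j])]
            match = sum(z) / norm_x if norm_x else 1.0
            if match >= vigilance:
                templates[j] = z  # resonance: template shrinks to z
                chosen = j
                break
        if chosen is None:
            # No category resonated: open a new one with x as template.
            templates.append(list(x))
            chosen = len(templates) - 1
        assignments.append(chosen)
    return templates, assignments


data = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
templates_lo, assign_lo = art1_fit(data, vigilance=0.5)
templates_hi, assign_hi = art1_fit(data, vigilance=1.0)
```

On this toy input, a vigilance of 0.5 groups the four vectors into two categories, while a vigilance of 1.0 splits them into three, which illustrates how higher vigilance produces narrower categories and less generalisation.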