
Morphological Inflection Generation with Hard Monotonic Attention

Roee Aharoni, Yoav Goldberg

TL;DR

The paper presents a hard attention model for morphological inflection generation that enforces near-monotonic input-output alignments. It uses a biLSTM encoder and a decoder that either writes output symbols or advances the input pointer, trained on oracle action sequences derived from independent alignments. Across CELEX, Wiktionary, and SIGMORPHON datasets, the approach achieves state-of-the-art results, especially in low-resource settings, and demonstrates competitive decoding efficiency. Analyses compare hard and soft attention in terms of alignments and representations, offering insights into learned features for inflection tasks. The work suggests broader applicability to other monotonic align-and-transduce problems.

Abstract

We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection. We evaluate the model on three previously studied morphological inflection generation datasets and show that it provides state-of-the-art results in various setups compared to previous neural and non-neural approaches. Finally, we present an analysis of the continuous representations learned by both the hard and soft attention \cite{bahdanauCB14} models for the task, shedding some light on the features such models extract.

Paper Structure

This paper contains 16 sections, 6 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: The hard attention network architecture. A round tip expresses concatenation of the inputs it receives. The attention is promoted to the next input element once a step action is predicted.
  • Figure 2: Top: an alignment between a lemma $x_{1:n}$ and an inflection $y_{1:m}$ as predicted by the aligner. Bottom: $s_{1:q}$, the sequence of actions to be predicted by the network, as produced by Algorithm 1 for the given alignment.
  • Figure 3: Learning curves for the soft and hard attention models on the first fold of the CELEX dataset.
  • Figure 4: A comparison of the alignments as predicted by the soft attention (left) and the hard attention (right) models on examples from CELEX.
  • Figure 5: SVD dimension reduction to 2D of 500 character representations in context from the encoder, for both the soft attention (top) and hard attention (bottom) models.
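
The caption of Figure 2 describes turning a character alignment into a sequence of actions (write a symbol, or step the attention to the next input element). The following is a minimal Python sketch of that idea, not the paper's algorithm itself: the names `STEP`, `alignment_to_actions`, and `replay` are hypothetical, the alignment format (pairs of characters, with an empty string for an unaligned side) is an assumption, and the attention pointer is assumed to start just before the first lemma character.

```python
# Illustrative sketch only: a simplified alignment-to-actions conversion.
# All names and the alignment format here are assumptions, not the paper's code.

STEP = "<step>"  # hypothetical control symbol: advance the attention pointer


def alignment_to_actions(alignment):
    """Convert aligned (lemma_char, inflection_char) pairs into actions.

    An empty string on either side marks an unaligned character.
    Each consumed lemma character yields a STEP (the pointer moves onto it);
    each inflection character yields a write action (the character itself).
    """
    actions = []
    for src, tgt in alignment:
        if src:
            actions.append(STEP)
        if tgt:
            actions.append(tgt)
    return actions


def replay(actions):
    """Replay actions, returning the written string and the final pointer position."""
    pointer = 0  # number of lemma characters the pointer has moved over
    written = []
    for action in actions:
        if action == STEP:
            pointer += 1
        else:
            written.append(action)
    return "".join(written), pointer


# Toy monotonic alignment for flog -> flogged (chosen by hand for illustration).
alignment = [("f", "f"), ("l", "l"), ("o", "o"), ("g", "g"),
             ("", "g"), ("", "e"), ("", "d")]
actions = alignment_to_actions(alignment)
output, consumed = replay(actions)
print(actions)
print(output, consumed)
```

Replaying the actions recovers the inflection and moves the pointer once per lemma character, which is the property a training oracle of this kind needs: the action sequence fully determines both the output string and the monotonic attention path.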