Recent advancements in computational morphology: A comprehensive survey
Jatayu Baxi, Brijesh Bhatt
TL;DR
This survey assesses computational morphology from traditional rule-based methods (Two-Level Morphology, FSTs, paradigms, and stemmers) to data-driven machine learning and contemporary Transformer-based models. It catalogs unsupervised and supervised learning approaches, examines deep architectures for segmentation and tagging, and highlights joint lemma–MSD modeling and cross-lingual transfer for low-resource languages. The paper also reviews major morphology datasets (UD Treebanks, UniMorph) and analyzes the relative strengths, data requirements, and cross-language applicability of these approaches. It concludes with open research issues, including data scarcity, interpretability, and the need for broad, multilingual resources to advance robust morphology tooling.
Abstract
Computational morphology handles language processing at the word level. It is one of the foundational tasks in the NLP pipeline and underpins the development of higher-level NLP applications. It mainly deals with the processing of words and word forms. Computational morphology addresses various sub-problems such as morpheme boundary detection, lemmatization, morphological feature tagging, and morphological reinflection. In this paper, we present an exhaustive survey of methods for developing computational morphology tools. We survey the literature in chronological order, from conventional methods to the recent evolution of deep neural network based approaches. We also review the existing datasets available for this task across languages. We discuss the effectiveness of neural models compared with traditional models and present some unique challenges associated with building computational morphology tools. We conclude by discussing some recent and open research issues in this field.
