Natural Language Processing RELIES on Linguistics
Juri Opitz, Shira Wein, Nathan Schneider
TL;DR
The paper argues that linguistics remains essential to NLP despite the rise of large language models, and introduces the RELIES framework to categorize six facets where linguistic knowledge informs practice: Resources, Evaluation, Low-resource settings, Interpretability, Explanation, and the Study of language. It surveys how linguistics shapes data resources (e.g., Universal Dependencies, Abstract Meaning Representation), evaluation protocols (gold standards, human/meta-evaluation, linguistically informed metrics), low-resource methodologies (global/historic languages, endangered languages, compute-efficient strategies, sensitive supervision), interpretability practices (binding internal representations to linguistic concepts), and the study of language as an application domain. Across these sections, the authors give concrete examples of the challenges and opportunities in integrating linguistic insight into NLP workflows, even when state-of-the-art neural models are used. The paper emphasizes collaborative, interdisciplinary approaches, especially for endangered languages, community-driven supervision, and the incorporation of symbolic linguistic criteria into interpretability and evaluation, to ensure robust, responsible, and linguistically informed NLP progress.
Abstract
Large Language Models (LLMs) have become capable of generating highly fluent text in certain languages, without modules specially designed to capture grammar or semantic coherence. What does this mean for the future of linguistic expertise in NLP? We highlight several aspects in which NLP (still) relies on linguistics, or where linguistic thinking can illuminate new directions. We argue our case around the acronym RELIES that encapsulates six major facets where linguistics contributes to NLP: Resources, Evaluation, Low-resource settings, Interpretability, Explanation, and the Study of language. This list is not exhaustive, nor is linguistics the main point of reference for every effort under these themes; but at a macro level, these facets highlight the enduring importance of studying machine systems vis-à-vis systems of human language.
