From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP
Marius Mosbach, Vagrant Gautam, Tomás Vergara-Browne, Dietrich Klakow, Mor Geva
TL;DR
The paper addresses the critique that interpretability and analysis (IA) research in NLP lacks practical impact by quantifying IA's influence on the field. It combines a bibliometric analysis of a citation graph of 185,384 papers, built from all ACL and EMNLP papers published from 2018 to 2023 together with their references and citations, with a survey of 138 NLP researchers and a manual annotation of 556 papers. The findings show that IA work is well cited outside of IA, is central in the NLP citation graph, and informs many newly proposed methods, although highly influential non-IA work tends to cite IA findings without being driven by them. The authors close by summarizing what is missing in IA work today, emphasizing unified theory, actionable guidance, human-centered evaluation, and standardized methods, and issue a call to action for the community to move IA research beyond descriptive insights.
Abstract
Interpretability and analysis (IA) research is a growing subfield within NLP with the goal of developing a deeper understanding of the behavior or inner workings of NLP systems and methods. Despite growing interest in the subfield, a criticism of this work is that it lacks actionable insights and therefore has little impact on NLP. In this paper, we seek to quantify the impact of IA research on the broader field of NLP. We approach this with a mixed-methods analysis of: (1) a citation graph of 185K+ papers built from all papers published at ACL and EMNLP conferences from 2018 to 2023, and their references and citations, and (2) a survey of 138 members of the NLP community. Our quantitative results show that IA work is well-cited outside of IA, and central in the NLP citation graph. Through qualitative analysis of survey responses and manual annotation of 556 papers, we find that NLP researchers build on findings from IA work, perceive it as important for progress in NLP and in multiple subfields, and rely on its findings and terminology for their own work. Many novel methods are proposed based on IA findings and are highly influenced by them, but highly influential non-IA work cites IA findings without being driven by them. We end by summarizing what is missing in IA work today and provide a call to action, to pave the way for a more impactful future of IA research.
