Overview of MWE history, challenges, and horizons: standing at the 20th anniversary of the MWE workshop series via MWE-UD2024

Lifeng Han, Kilian Evang, Archna Bhatia, Gosse Bouma, A. Seza Doğruöz, Marcos Garcia, Voula Giouli, Joakim Nivre, Alexandre Rademaker

TL;DR

This position paper surveys twenty years of MWE workshop history, synthesizing research topics, methodologies, and community resources that shaped MWE research in CL/NLP. It traces key milestones from PARSEME and UD integrations to joint workshops with LAW-MWE-CxG and Clinical-NLP, and it highlights shared-task progress on vMWEs. It discusses persistent challenges, notably unseen vMWEs and the need for broader data resources for non-verbal MWEs, while outlining opportunities in social-media text, domain adaptation, and LLM interpretability. The paper argues for future research horizons that expand MWE resources, cross-lingual consistency, and domain-specific applications, providing guidance to researchers, students, and practitioners.

Abstract

The first MWE workshop was held with ACL in Sapporo, Japan, in 2003. This year, the joint MWE-UD workshop, co-located with the LREC-COLING 2024 conference, marked the 20th anniversary of the MWE workshop series. Standing at this milestone, we look back on the series and summarise the research topics and methodologies that researchers have pursued over the years. We also discuss the current challenges and the broader impacts and synergies of MWE research within the CL and NLP fields. Finally, we give future research perspectives. We hope this position paper helps researchers, students, and industrial practitioners interested in MWEs gain a brief and accessible understanding of the field's history, current state, and possible future.

Paper Structure

This paper contains 11 sections.