A Curious Class of Adpositional Multiword Expressions in Korean
Junghyun Min, Na-Rae Han, Jena D. Hwang, Nathan Schneider
TL;DR
This work addresses the underrepresentation of Korean adpositional MWEs in multilingual annotation frameworks by defining and analyzing postpositional verb-based constructions (PVCs). It builds a PVC inventory from a Korean Wikipedia crawl by running a regex-based pipeline over a MeCab-processed corpus, then manually verifying candidates for fossilized, non-predicative behavior. PVCs are shown to be fixed MWEs with limited inflection, distinct from light verb constructions (LVCs); they appear in adnominal or connective forms rather than as main predicates. The authors propose annotation guidelines to align Korean PVCs with the PARSEME framework and call for comprehensive cross-lingual annotation and for cross-domain studies that broaden coverage and refine the morphosyntactic analysis.
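The candidate-extraction step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the input format (space-separated `morpheme/TAG` pairs), the Sejong-style tags (`JKB`, `JKO`, `VV`, `ETM`, `EC`), the regex, and the function name `candidate_pvcs` are all assumptions made for illustration, and the tagged sentences are hand-written rather than real MeCab output. The sketch only collects particle + verb + adnominal/connective-ending sequences; deciding which candidates are genuine PVCs would still require the manual verification the summary mentions.

```python
import re
from collections import Counter

# Hypothetical, hand-written sentences in a simplified "morpheme/TAG" format.
# Real MeCab output is laid out differently and would need converting first.
TAGGED = [
    "환경/NNG 에/JKB 대하/VV ㄴ/ETM 연구/NNG",             # "research about the environment"
    "규정/NNG 에/JKB 따르/VV 아/EC 처리/NNG 하/XSV 았/EP 다/EF",  # "handled according to the rules"
    "학교/NNG 에/JKB 가/VV 았/EP 다/EF",                    # "went to school" (main predicate)
]

# Assumed pattern: case particle + verb stem + adnominal (ETM) or connective (EC) ending.
PATTERN = re.compile(r"(?<!\S)(\S+)/JK[BO] (\S+)/VV (\S+)/(ETM|EC)(?= |$)")


def candidate_pvcs(tagged_sentences):
    """Count (particle, verb stem, ending tag) triples that match the pattern."""
    counts = Counter()
    for sentence in tagged_sentences:
        for match in PATTERN.finditer(sentence):
            particle, verb, _ending, ending_tag = match.groups()
            counts[(particle, verb, ending_tag)] += 1
    return counts


if __name__ == "__main__":
    for triple, n in candidate_pvcs(TAGGED).most_common():
        print(triple, n)
```

In this toy input the third sentence is skipped because its verb carries a tense marker and a sentence-final ending, which loosely mirrors the observation that PVCs do not surface as main predicates. A pattern this simple will also match ordinary, non-fossilized verb phrases, so frequency counts are at best a starting point for manual review.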
Abstract
Multiword expressions (MWEs) have been widely studied in cross-lingual annotation frameworks such as PARSEME. However, Korean MWEs remain underrepresented in these efforts. In particular, Korean multiword adpositions lack systematic analysis, annotated resources, and integration into existing multilingual frameworks. In this paper, we study a class of Korean functional multiword expressions: postpositional verb-based constructions (PVCs). Using data from Korean Wikipedia, we survey and analyze several PVC expressions and contrast them with non-MWEs and light verb constructions (LVCs) with similar structure. Building on this analysis, we propose annotation guidelines designed to support future work in Korean multiword adpositions and facilitate alignment with cross-lingual frameworks.
