K-UD: Revising Korean Universal Dependencies Guidelines
Kyuwon Kim, Yige Chen, Eunkyul Leah Jo, KyungTae Lim, Jungyeul Park, Chulwoo Park
TL;DR
The paper critiques the current Korean UD guidelines, pointing to inconsistent head selection in nominal phrases and ambiguity in the treatment of clausal and non-core dependents. It proposes Revised Korean UD Guidelines that redefine nominal headhood (the last noun heads a noun compound) and specify how core relations ($nsubj$, $obj$, $iobj$, $csubj$, $ccomp$, $xcomp$) are distinguished from non-core ones, along with explicit $obl:arg$ and $dislocated:nsubj$ annotations. The authors validate the revisions by annotating 200 Sejong sentences using frame information from the Sejong dictionary, and they discuss alignment with Sejong-style dependency structure as a route to broader linguistic consensus. The goal is to harmonize UD-style and Sejong-style dependencies and to integrate them with the KLUE benchmark and other Korean language resources in support of Korean NLP tasks and treebank construction.
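The core versus non-core distinction summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `base_relation` and `is_core` are hypothetical, and the sketch assumes that a subtyped label such as `obl:arg` or `dislocated:nsubj` is classified by the part before the colon, as in general UD practice.

```python
# Core argument relations, as listed in the summary above.
CORE_RELATIONS = {"nsubj", "obj", "iobj", "csubj", "ccomp", "xcomp"}


def base_relation(deprel: str) -> str:
    """Return the universal part of a label, e.g. 'obl:arg' -> 'obl'."""
    return deprel.split(":", 1)[0]


def is_core(deprel: str) -> bool:
    """Assumption: a subtyped label inherits core status from its base label."""
    return base_relation(deprel) in CORE_RELATIONS


if __name__ == "__main__":
    for label in ["nsubj", "ccomp", "obl:arg", "dislocated:nsubj"]:
        print(label, "core" if is_core(label) else "non-core")
```

Under this assumption, `obl:arg` and `dislocated:nsubj` both fall on the non-core side even though their subtypes name argument-like functions, which is why the guidelines need to spell them out explicitly.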
Abstract
Criticism has been raised concerning the existing linguistic annotation framework for Korean Universal Dependencies (UD), particularly with regard to syntactic relations. In this paper, our primary objective is to refine the definition of syntactic dependency in UD for the analysis of Korean. We aim not only to reach consensus within UD but also, by establishing a linguistic consensus model, to gain agreement beyond the UD framework on how Korean sentences should be analyzed using dependency structure.
