Unlocking Korean Verbs: A User-Friendly Exploration into the Verb Lexicon
Seohyun Song, Eunkyul Leah Jo, Yige Chen, Jeen-Pyo Hong, Kyuwon Kim, Jin Wee, Miyoung Kang, KyungTae Lim, Jungyeul Park, Chulwoo Park
TL;DR
This paper makes the Korean verb subcategorization frames in the Sejong dictionary more usable for NLP through a user-friendly web interface and a Python library (pySejongFrame). The web interface maps subcategorization frames to sentence examples; the library offers direct and lazy loading, NLTK integration, and corpus querying for frame semantics. The paper also positions Sejong relative to PropBank, NIKL SRL, and FrameNet, and outlines plans to integrate additional resources toward a Korean VerbNet. The work aims to broaden access to Korean linguistic resources and support diverse language-processing applications, while noting licensing and static-data limitations.
Abstract
The Sejong dictionary dataset is a valuable resource with extensive coverage of morphology, syntax, and semantic representation, which supports in-depth exploration of linguistic information. Its labeled linguistic structures provide the basis for uncovering how words and phrases relate to their target verbs. This paper introduces a user-friendly web interface for collecting and consolidating verb-related information, with a particular focus on subcategorization frames. It also describes our efforts to align subcategorization frames with corresponding illustrative sentence examples. Furthermore, we provide a Python library intended to simplify syntactic parsing and semantic role labeling. These tools are intended to assist those interested in using the Sejong dictionary dataset to develop applications for Korean language processing.
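To make the idea of aligning subcategorization frames with example sentences concrete, the following is a minimal sketch of one way such a mapping could be represented and loaded lazily. It is not the pySejongFrame API: the names `Frame`, `LazyVerbLexicon`, and `toy_loader` are hypothetical, and the frame pattern and example sentence are illustrative stand-ins rather than entries taken from the Sejong dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    """One subcategorization frame and the example sentences aligned to it."""
    pattern: str
    examples: List[str] = field(default_factory=list)


class LazyVerbLexicon:
    """Hypothetical lexicon: maps a verb lemma to its frames and
    defers loading the underlying data until the first lookup."""

    def __init__(self, loader: Callable[[], Dict[str, List[Frame]]]):
        self._loader = loader
        self._entries: Optional[Dict[str, List[Frame]]] = None  # not loaded yet

    def frames(self, verb: str) -> List[Frame]:
        if self._entries is None:
            # First lookup triggers the (possibly expensive) load.
            self._entries = self._loader()
        return self._entries.get(verb, [])


load_calls: List[str] = []  # records each time the loader actually runs


def toy_loader() -> Dict[str, List[Frame]]:
    """Stand-in for reading dictionary files; the data here is illustrative."""
    load_calls.append("loaded")
    return {
        "먹다": [Frame("X=N0-이 Y=N1-을 V", ["아이가 밥을 먹는다."])],
    }


lexicon = LazyVerbLexicon(toy_loader)

# Nothing is loaded until a verb is looked up.
for frame in lexicon.frames("먹다"):
    print(frame.pattern, frame.examples)
```

In this sketch, direct loading would simply call the loader in the constructor; deferring it keeps start-up cheap when only a few verbs are ever queried.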
