UKTA: Unified Korean Text Analyzer

Seokho Ahn, Junhyung Park, Ganghee Go, Chulhui Kim, Jiho Jung, Myung Sun Shin, Do-Guk Kim, Young-Duk Seo

TL;DR

The paper tackles the challenge of evaluating Korean writing quality in a timely and explainable way. It proposes UKTA, a Unified Korean Text Analyzer, with a three-stage pipeline: accurate morpheme analysis, mid-level lexical feature extraction across 294 metrics, and a rubric-based automatic writing evaluation using an attention-augmented model that fuses sentence-level KoBERT/BiGRU representations with essay-level features. On AI-HUB's Essay Evaluation Dataset, UKTA improves both accuracy and Quadratic Weighted Kappa ($QWK$) compared with a baseline, with improvements in 9 of 10 rubrics and informative feature-level explanations. The system outputs transparent rubric scores and top contributing features, enabling reliable and explainable feedback for Korean learners.

Abstract

Evaluating writing quality is complex and time-consuming, often delaying feedback to learners. While automated writing evaluation tools are effective for English, Korean automated writing evaluation tools face challenges due to their inability to address multi-view analysis, error propagation, and evaluation explainability. To overcome these challenges, we introduce UKTA (Unified Korean Text Analyzer), a comprehensive Korean text analysis and writing evaluation system. UKTA provides accurate low-level morpheme analysis, key lexical features for mid-level explainability, and transparent high-level rubric-based writing scores. Our approach improves accuracy and quadratic weighted kappa over the existing baseline, positioning UKTA as a leading multi-perspective tool for Korean text analysis and writing evaluation.
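Quadratic weighted kappa is the agreement metric reported above, so a short sketch of how it is commonly computed may help. This follows the standard definition (one minus the ratio of weighted observed disagreement to weighted chance disagreement) and is not taken from the UKTA code; the function name, argument names, and the integer rating scale are assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Standard QWK between two lists of integer scores (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("the rating scale needs at least two levels")
    total = len(rater_a)

    # Observed confusion matrix: rows follow rater_a, columns follow rater_b.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    # Both raters used one identical score everywhere: treat as full agreement.
    if denominator == 0:
        return 1.0
    return 1.0 - numerator / denominator
```

Under this definition, identical score lists give a value of 1, and the quadratic weights penalize a two-point disagreement four times as heavily as a one-point disagreement.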

Paper Structure

This paper contains 12 sections, 1 equation, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Illustrative overview of UKTA.
  • Figure 2: UKTA functionality. (A) Functionality in morpheme analysis results: provided in both table (A-1) and list (A-2) formats, with an interactive and intuitive interface; results can be downloaded in JSON and TXT formats (A-3). (B) Functionality in lexical feature analysis results: provided as categorized lexical features (B-1) in a list format (B-2); results can be downloaded in TXT and CSV formats with selected features (B-3).