Enhancing Essay Scoring with Adversarial Weights Perturbation and Metric-specific AttentionPooling

Jiaxin Huang; Xinyu Zhao; Chang Che; Qunwei Lin; Bo Liu

TL;DR

This work targets English Language Learners (ELLs) by extending automated essay scoring (AES) with DeBERTa, augmented by Adversarial Weights Perturbation (AWP) and Metric-specific AttentionPooling, one pooling head for each of six dimensions of writing (6AP). By systematically tuning hyperparameters such as adv_lr and adv_eps within a 5-fold cross-validation framework, the approach improves language-proficiency evaluation on the ELLIPSE corpus and achieves a competitive MCRMSE (mean columnwise root mean squared error). The combination of DeBERTa-v3-large, 6AP, and AWP yields the strongest performance, underscoring the value of metric-aware attention and adversarial training for tailoring feedback to ELL needs. The findings suggest that automated feedback tools for language development can be meaningfully improved, and they point to future work in hyperparameter optimization and dataset diversification to broaden the applicability and robustness of AES for ELLs.
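Since MCRMSE is the headline metric here, a minimal sketch of how it is computed may help: take the RMSE separately for each scored dimension, then average across dimensions. The function name `mcrmse` and the toy scores are illustrative assumptions, not taken from the paper.

```python
import math


def mcrmse(y_true, y_pred):
    """Mean columnwise RMSE: RMSE per target column, averaged over columns.

    y_true and y_pred are lists of rows, one row per essay and one
    column per scored dimension (six in this setting).
    """
    n_rows = len(y_true)
    n_cols = len(y_true[0])
    column_rmses = []
    for j in range(n_cols):
        # Sum of squared errors for dimension j over all essays.
        sq_err = sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n_rows))
        column_rmses.append(math.sqrt(sq_err / n_rows))
    return sum(column_rmses) / n_cols


# Toy example (made-up scores): two essays, six dimensions each.
y_true = [[3.0] * 6, [4.0] * 6]
y_pred = [[3.5] * 6, [4.0] * 6]
score = mcrmse(y_true, y_pred)
```

Because the average is taken over columns after the square root, a large error on one dimension is not diluted by the squared errors of the others in the way it would be in a single pooled RMSE.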

Abstract

The objective of this study is to improve automated feedback tools designed for English Language Learners (ELLs) using data science techniques that span machine learning, natural language processing, and educational data analytics. Automated essay scoring (AES) research has made strides in evaluating written essays, but it often overlooks the specific needs of ELLs in language development. This study explores the application of BERT-related techniques to enhance the assessment of ELLs' writing proficiency within AES. To address these needs, we propose the use of DeBERTa, a state-of-the-art neural language model, for improving automated feedback tools. DeBERTa, pretrained on large text corpora using self-supervised learning, learns universal language representations that adapt to various natural language understanding tasks. The model incorporates several techniques, including adversarial training through Adversarial Weights Perturbation (AWP) and Metric-specific AttentionPooling (6AP), with a separate attention-pooling head for each of the six labels in the competition. The primary focus of this research is to investigate the impact of hyperparameters, particularly the adversarial learning rate, on the performance of the model. By tuning these hyperparameters and examining the influence of 6AP and AWP, the resulting models can provide more accurate evaluations of language proficiency and support tailored learning tasks for ELLs. This work has the potential to benefit ELLs by improving their English language proficiency and facilitating their educational journey.
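The idea of metric-specific attention pooling can be sketched in a few lines. The abstract does not describe the exact pooling head, so what follows is a simplified illustration under stated assumptions: each metric owns a single scoring vector that produces softmax attention weights over token states, followed by a linear regression output. All names (`attention_pool`, `metric_specific_scores`, `heads`) and the toy numbers are hypothetical.

```python
import math


def attention_pool(token_states, score_weights):
    """Pool token vectors into one vector using softmax attention weights."""
    # One scalar score per token: dot product with the scoring vector.
    scores = [sum(h * w for h, w in zip(tok, score_weights)) for tok in token_states]
    # Numerically stable softmax over the token axis.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(token_states[0])
    # Attention-weighted sum of the token vectors.
    return [sum(a * tok[d] for a, tok in zip(attn, token_states)) for d in range(dim)]


def metric_specific_scores(token_states, heads):
    """Apply one pooling head per metric.

    heads maps a metric name to (score_weights, out_weights, bias).
    This layout is an assumption made for illustration only.
    """
    predictions = {}
    for name, (score_w, out_w, bias) in heads.items():
        pooled = attention_pool(token_states, score_w)
        predictions[name] = sum(p * w for p, w in zip(pooled, out_w)) + bias
    return predictions


# Toy encoder output: two tokens with two hidden features each.
tokens = [[1.0, 0.0], [3.0, 2.0]]
heads = {
    "metric_a": ([0.0, 0.0], [1.0, 1.0], 0.5),   # uniform attention
    "metric_b": ([10.0, 0.0], [1.0, 0.0], 0.0),  # attends almost only to token 2
}
preds = metric_specific_scores(tokens, heads)
```

With separate heads, each metric can weight the same tokens differently, which a single shared mean-pooling layer cannot do.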

