Transformer-based Joint Modelling for Automatic Essay Scoring and Off-Topic Detection

Sourya Dipta Das, Yash Vadi, Kuldeep Yadav

TL;DR

This work tackles the AES vulnerability to off-topic responses by introducing Automated Open Essay Scoring (AOES), a transformer-based model that jointly scores on-topic essays and detects off-topic content. A Topic Regularization Module (TRM) on top of a BERT-like backbone decouples the regression pathway to mitigate score overestimation, trained with a hybrid loss that enforces topic-consistent calibration. Off-topic detection is performed unsupervised via a Mahalanobis-distance score computed from multi-layer latent representations, enabling effective flagging without requiring off-topic training data. Evaluations on ASAP-AES and PsyW-Essay show that AOES improves on-topic scoring and achieves robust off-topic detection, including resilience to adversarial perturbations, highlighting practical utility for real-world automated writing assessment systems.

Abstract

Automated Essay Scoring (AES) systems are widely used because they offer a cost-effective and time-efficient alternative to manual grading. Nevertheless, many studies have demonstrated that AES systems fail to assign lower grades to irrelevant responses. Detecting off-topic responses is therefore crucial in practical settings where candidates write text unrelated to the given question. In this paper, we propose an unsupervised technique that jointly scores essays and detects off-topic essays. The proposed Automated Open Essay Scoring (AOES) model uses a novel topic regularization module (TRM), which can be attached on top of a transformer model, and is trained using a proposed hybrid loss function. After training, the AOES model is used to calculate a Mahalanobis distance score for off-topic essay detection. Our proposed method outperforms the baseline we created and earlier conventional methods on two essay-scoring datasets in both off-topic detection and on-topic scoring. Experimental evaluation on different adversarial strategies also shows that the proposed method is robust in detecting possible human-level perturbations.
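
To make the Mahalanobis-distance scoring step concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes that each transformer layer yields one fixed-size feature vector per essay, that a Gaussian is fitted per layer to on-topic training features, and that the layer-wise distances are summed into one score. The summation, the function names (`fit_gaussian`, `mahalanobis_score`, `off_topic_score`), and the synthetic features are all illustrative assumptions.

```python
import numpy as np


def fit_gaussian(features, eps=1e-6):
    """Estimate the mean and inverse covariance of on-topic features.

    features: array of shape (n_essays, dim) for one layer.
    eps adds a small ridge so the covariance is always invertible.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / max(len(features) - 1, 1)
    cov += eps * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis_score(x, mean, cov_inv):
    """Mahalanobis distance of one feature vector from the fitted Gaussian."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))


def off_topic_score(layer_feats, layer_stats):
    """Sum layer-wise distances (assumed aggregation, not from the paper)."""
    return sum(
        mahalanobis_score(x, mean, cov_inv)
        for x, (mean, cov_inv) in zip(layer_feats, layer_stats)
    )


# Synthetic stand-in for on-topic latent features from two layers.
rng = np.random.default_rng(0)
train_layers = [rng.normal(size=(200, 8)) for _ in range(2)]
layer_stats = [fit_gaussian(f) for f in train_layers]

# An essay at the on-topic mean versus one shifted far away in every layer.
near = [mean for mean, _ in layer_stats]
far = [mean + 5.0 for mean, _ in layer_stats]

near_score = off_topic_score(near, layer_stats)
far_score = off_topic_score(far, layer_stats)
print(near_score < far_score)
```

Because only on-topic features are needed to fit the per-layer statistics, a detector of this kind requires no off-topic training data; an essay would be flagged when its score exceeds a threshold chosen on held-out on-topic essays.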

Paper Structure

This paper contains 27 sections, 5 equations, 4 figures, and 8 tables.

Figures (4)

  • Figure 1: Proposed AOES Model Architecture Diagram
  • Figure 2: An overview of the off-topic detection process. Here, the $D^t_{MD}$ function is used to calculate the layer-wise Mahalanobis distance, as defined in Equation \ref{md_dist}.
  • Figure 3: Histogram of detection scores using various methods on a single prompt from the ASAP-AES dataset: (a) AOES, (b) Word Mover's Distance (shahzad2022computerization).
  • Figure 4: Histogram of detection scores using various methods on a single prompt from the PsyW-Essay dataset: (a) AOES, (b) Word Mover's Distance (shahzad2022computerization).