Hey AI Can You Grade My Essay?: Automatic Essay Grading
Maisha Maliha, Vishal Pramanik
TL;DR
This paper tackles holistic automatic essay grading (AEG) and the limitations of single-network systems. It introduces a Collaborative Deep Learning Network (CDLN) that fuses CNN, Recursive Neural Network (RvNN), and LSTM components with a dense output head, leveraging Word2Vec embeddings and transfer learning to jointly address grammatical/structural features and overall essay ideas. Empirical results on ASAP AES show CDLN achieving an average accuracy of $0.8550$, with PCC $0.7545$ and QWK $0.7036$, outperforming baselines including SVM and BERT, and demonstrating robustness to paraphrasing. The work highlights the value of collaborative, multi-network architectures for holistic scoring and suggests future directions involving more pretrained models and expanded domain adaptation for scalable educational assessment.
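For readers unfamiliar with the two agreement metrics quoted above, the following is a minimal sketch of how quadratic weighted kappa (QWK) and the Pearson correlation coefficient (PCC) are commonly computed from integer essay scores. It is not the authors' evaluation code: the function names, the score range arguments, and the sample scores are assumptions chosen for illustration, and the sketch assumes at least two distinct score levels and non-constant score lists.

```python
from math import sqrt


def quadratic_weighted_kappa(human, model, min_score, max_score):
    """QWK between two lists of integer scores in [min_score, max_score]."""
    n_labels = max_score - min_score + 1
    n = len(human)

    # Observed agreement matrix: rows are human scores, columns are model scores.
    observed = [[0] * n_labels for _ in range(n_labels)]
    for h, m in zip(human, model):
        observed[h - min_score][m - min_score] += 1

    # Marginal histograms, used to build the chance-agreement matrix.
    hist_human = [sum(row) for row in observed]
    hist_model = [sum(observed[i][j] for i in range(n_labels))
                  for j in range(n_labels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_labels):
        for j in range(n_labels):
            # Quadratic penalty: larger score gaps cost more.
            weight = (i - j) ** 2 / (n_labels - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * hist_human[i] * hist_model[j] / n

    return 1.0 - weighted_observed / weighted_expected


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


if __name__ == "__main__":
    # Made-up scores for illustration only; not taken from ASAP AES.
    human_scores = [2, 3, 4, 4, 1, 3]
    model_scores = [2, 3, 3, 4, 2, 3]
    print(quadratic_weighted_kappa(human_scores, model_scores, 1, 4))
    print(pearson_correlation(human_scores, model_scores))
```

QWK equals 1 when the two raters agree on every essay and falls toward or below 0 as agreement approaches or drops under chance level, which is why it is often reported alongside plain accuracy for essay scoring.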
Abstract
Automatic essay grading (AEG) has attracted the attention of the NLP community because of its educational applications, such as scoring essays and short answers. AEG systems can save significant time and money when grading essays. In existing work, a single network is responsible for the whole grading process, which may be ineffective because a single network may not be able to learn all the features of a human-written essay. In this work, we introduce a new model that outperforms the state-of-the-art models in the field of AEG. We use collaborative and transfer learning: one network is responsible for checking the grammatical and structural features of the sentences of an essay, while another network is responsible for scoring the overall idea present in the essay. What these networks learn is then transferred to another network that scores the essay. We also compare the performance of the different models mentioned in our work; our proposed model shows the highest accuracy, 85.50%.
