An Automated Explainable Educational Assessment System Built on LLMs
Jiazheng Li, Artem Bobrov, David West, Cesare Aloisi, Yulan He
TL;DR
This paper addresses two obstacles in automated student answer scoring, limited explainability and high annotation cost, by using large language models (LLMs) to produce both marks and rationales. It introduces AERA Chat, a microservices-based platform that offers a bulk marking workflow, explainable highlighting, an annotation toolkit, performance evaluation, and a chat interface, and that supports both public and private LLMs. The platform automates question initialization, scoring, rationale generation, and visualization. Users can also correct ground-truth marks, record their preferences, and annotate rationales, which yields data for reinforcement learning from human feedback (RLHF) and supervised fine-tuning. The work demonstrates a practical, interactive framework for trustworthy, explainable automated assessment that could be adopted broadly in educational settings.
Abstract
In this demo, we present AERA Chat, an automated and explainable educational assessment system designed for interactive and visual evaluation of student responses. The system leverages large language models (LLMs) to generate marks and rationale explanations automatically, addressing the limited explainability of automated educational assessment and the high cost of annotation. Users can input questions and student answers, and the system gives educators and researchers insight into assessment accuracy and the quality of LLM-generated rationales. It also offers advanced visualization and robust evaluation tools, improving usability for educational assessment and making rationale verification more efficient. Our demo video can be found at https://youtu.be/qUSjz-sxlBc.
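To make the annotation workflow more concrete, the following is a minimal sketch of how a marked answer and its human annotations might be represented and turned into evaluation and training data. It is not AERA Chat's actual schema or API: the `MarkedAnswer` record, its field names, the prompt template, and the `prompt`/`completion`/`chosen`/`rejected` keys are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MarkedAnswer:
    """Hypothetical record: one student answer with LLM output and human annotations."""
    question: str
    answer: str
    llm_mark: int
    llm_rationale: str
    true_mark: Optional[int] = None            # human-confirmed ground-truth mark, if any
    rationale_ok: Optional[bool] = None        # human verdict on the LLM rationale, if any
    preferred_rationale: Optional[str] = None  # human-preferred rationale, if any


def build_prompt(rec: MarkedAnswer) -> str:
    # Assumed prompt template; a real system would likely include a marking rubric.
    return f"Question: {rec.question}\nStudent answer: {rec.answer}\nMark and rationale:"


def accuracy(records: List[MarkedAnswer]) -> float:
    """Fraction of ground-truth-labelled records where the LLM mark matches."""
    labelled = [r for r in records if r.true_mark is not None]
    if not labelled:
        raise ValueError("no records with a ground-truth mark")
    return sum(r.llm_mark == r.true_mark for r in labelled) / len(labelled)


def to_sft_example(rec: MarkedAnswer) -> Optional[dict]:
    """Keep a record for supervised fine-tuning only if a human confirmed
    both the mark and the rationale."""
    if rec.true_mark == rec.llm_mark and rec.rationale_ok is True:
        completion = f"Mark: {rec.llm_mark}\nRationale: {rec.llm_rationale}"
        return {"prompt": build_prompt(rec), "completion": completion}
    return None


def to_preference_pair(rec: MarkedAnswer) -> Optional[dict]:
    """Pair a human-preferred rationale with the LLM rationale it replaces,
    in the chosen/rejected shape often used for preference-based training."""
    if rec.preferred_rationale and rec.preferred_rationale != rec.llm_rationale:
        return {
            "prompt": build_prompt(rec),
            "chosen": rec.preferred_rationale,
            "rejected": rec.llm_rationale,
        }
    return None
```

Under these assumptions, ground-truth corrections feed the accuracy figure, confirmed outputs become supervised fine-tuning examples, and preferred rationales become preference pairs; records without annotations are simply skipped.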
