Directed Graph-alignment Approach for Identification of Gaps in Short Answers
Archana Sahu, Plaban Kumar Bhowmick
TL;DR
The study addresses automatic gap identification in short student answers for formative assessment by modeling each <model answer, student answer> pair as directed, labeled graphs and aligning them with a Similarity Flooding-based framework. It introduces the FA model, which comprises canonical graph construction, predicate clustering, and Similarity Flooding-based alignment with threshold, exact, and best filters, and which extracts gaps without requiring labeled training data. Empirical results on gap-annotated datasets derived from UNT, SciEntsBank, and Beetle show dataset-dependent gains over a strong undirected baseline, with directionality-aware variants outperforming the baseline in most scenarios. The work advances unsupervised formative feedback by enabling fine-grained gap annotation and provides a public dataset and code base to spur further research, while acknowledging limitations in information extraction quality and answer-size disparities.
Abstract
In this paper, we present a method for automatically identifying missing items, known as gaps, in student answers by comparing them against the corresponding model (reference) answers. The gaps can be identified at the word, phrase, or sentence level. The identified gaps are useful for providing feedback to students in formative assessment. The problem of gap identification is modelled as the alignment of a pair of directed graphs representing a student answer and the corresponding model answer for a given question. To validate the proposed approach, we developed a dataset of gap-annotated student answers drawn from three widely known datasets in the short answer grading domain, namely University of North Texas (UNT), SciEntsBank, and Beetle; this dataset is available at: https://github.com/sahuarchana7/gaps-answers-dataset. Evaluation metrics used in traditional machine learning tasks are adopted to evaluate gap identification. Although the performance of the proposed approach varies across datasets and answer types, the overall performance is promising.
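To make the graph-alignment idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each answer has already been converted into (source, label, target) triples, uses a simplified Similarity Flooding update with uniform propagation weights, and reports a model-answer node as a gap when no student node aligns with it above a threshold. The function names `similarity_flooding` and `find_gaps`, the initial similarity values, and the threshold of 0.5 are illustrative assumptions.

```python
from collections import defaultdict


def similarity_flooding(model_edges, student_edges, iterations=20):
    """Simplified Similarity Flooding over two directed, labeled graphs.

    Each graph is a list of (source, label, target) triples.
    Returns {(model_node, student_node): similarity in [0, 1]}.
    """
    # Pairwise connectivity graph: two node pairs are linked when both
    # graphs contain an edge with the same label between their members.
    neighbours = defaultdict(set)
    for a, p, a2 in model_edges:
        for b, q, b2 in student_edges:
            if p == q:
                neighbours[(a, b)].add((a2, b2))
                neighbours[(a2, b2)].add((a, b))  # flow in both directions

    # Assumed initial similarity: 1.0 for identical node labels, else 0.1.
    sigma0 = {pair: (1.0 if pair[0] == pair[1] else 0.1) for pair in neighbours}
    sigma = dict(sigma0)
    for _ in range(iterations):
        # Each pair keeps its own score and receives its neighbours' scores.
        nxt = {
            pair: sigma0[pair] + sigma[pair] + sum(sigma[n] for n in neighbours[pair])
            for pair in neighbours
        }
        top = max(nxt.values(), default=1.0)
        sigma = {pair: value / top for pair, value in nxt.items()}
    return sigma


def find_gaps(model_edges, student_edges, threshold=0.5):
    """Return model-answer nodes with no student match at or above threshold."""
    sigma = similarity_flooding(model_edges, student_edges)
    model_nodes = {n for a, _, a2 in model_edges for n in (a, a2)}
    best = defaultdict(float)
    for (model_node, _student_node), score in sigma.items():
        best[model_node] = max(best[model_node], score)
    return sorted(n for n in model_nodes if best[n] < threshold)


# Toy example: the student answer omits that blood carries oxygen.
model = [("heart", "pumps", "blood"), ("blood", "carries", "oxygen")]
student = [("heart", "pumps", "blood")]
print(find_gaps(model, student))  # ['oxygen']
```

Because node pairs are formed source-with-source and target-with-target, the sketch respects edge direction, which is the property the directed formulation relies on. A fuller version would also handle predicate clustering and the exact and best filters named above, which are omitted here.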
