Directed Graph-alignment Approach for Identification of Gaps in Short Answers

Archana Sahu, Plaban Kumar Bhowmick

TL;DR

The study addresses automatic gap identification in short student answers for formative assessment (FA) by modeling each <model answer, student answer> pair as directed, labeled graphs and aligning them with a Similarity Flooding-based framework. It introduces the FA model, which comprises canonical graph construction, predicate clustering, and Similarity Flooding-based alignment with threshold, exact, and best filters, and which extracts gaps without requiring labeled training data. Empirical results on gap-annotated datasets derived from UNT, SciEntsBank, and Beetle show dataset-dependent gains over a strong undirected baseline, with directionality-aware variants outperforming the baseline in most scenarios. The work supports unsupervised formative feedback through fine-grained gap annotation and provides a public dataset and code base for further research, while acknowledging limitations in information extraction quality and answer-size disparities.

Abstract

This paper presents a method for automatically identifying missing items, known as gaps, in student answers by comparing them against the corresponding model (reference) answers. Gaps can be identified at the word, phrase, or sentence level. The identified gaps are useful for providing feedback to students in formative assessment. Gap identification is modelled as the alignment of a pair of directed graphs representing a student answer and the corresponding model answer for a given question. To validate the proposed approach, gap-annotated student answers were developed from three widely known datasets in the short answer grading domain, namely University of North Texas (UNT), SciEntsBank, and Beetle; this gap-annotated dataset is available at: https://github.com/sahuarchana7/gaps-answers-dataset. Evaluation metrics used in traditional machine learning tasks are adopted to evaluate gap identification. Although the performance of the proposed approach varies across datasets and answer types, the overall performance is promising.
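
To make the alignment idea concrete, the following is a minimal sketch of Similarity Flooding on two tiny directed, labeled answer graphs. It is an illustration under stated assumptions, not the authors' implementation: the function names `similarity_flooding` and `find_gaps`, the toy triples, the exact-match initial similarity, and the simple threshold rule are all hypothetical, and the paper's canonical graph construction, predicate clustering, and exact/best filters are not reproduced here.

```python
from collections import defaultdict

def similarity_flooding(model_edges, student_edges, init_sim, iterations=20):
    """Basic fixpoint sigma <- normalize(sigma + flow(sigma)) over node pairs.

    Edges are (source, label, target) triples. All names are illustrative.
    """
    # Pairwise connectivity graph: two node pairs are linked when both
    # graphs contain an edge with the same label between their members.
    pcg = [((a, b), lbl, (a2, b2))
           for a, lbl, a2 in model_edges
           for b, lbl2, b2 in student_edges if lbl == lbl2]

    # Propagation weights: 1 / (number of same-label edges leaving or
    # entering a pair), applied in both directions.
    out_count, in_count = defaultdict(int), defaultdict(int)
    for src, lbl, dst in pcg:
        out_count[(src, lbl)] += 1
        in_count[(dst, lbl)] += 1
    incoming = defaultdict(list)
    for src, lbl, dst in pcg:
        incoming[dst].append((src, 1.0 / out_count[(src, lbl)]))
        incoming[src].append((dst, 1.0 / in_count[(dst, lbl)]))

    model_nodes = {n for a, _, a2 in model_edges for n in (a, a2)}
    student_nodes = {n for b, _, b2 in student_edges for n in (b, b2)}
    sigma = {(m, s): init_sim(m, s) for m in model_nodes for s in student_nodes}

    for _ in range(iterations):
        nxt = {p: sigma[p] + sum(w * sigma[q] for q, w in incoming.get(p, ()))
               for p in sigma}
        top = max(nxt.values()) or 1.0  # avoid division by zero
        sigma = {p: v / top for p, v in nxt.items()}
    return sigma

def find_gaps(sigma, threshold=0.5):
    """Assumed threshold rule: a model node with no student node at or
    above the threshold is reported as a gap."""
    best = defaultdict(float)
    for (m, _), v in sigma.items():
        best[m] = max(best[m], v)
    return sorted(m for m, v in best.items() if v < threshold)

# Toy answers: the student omits "oxygen" and writes "fluid" for "blood".
model = [("heart", "pumps", "blood"), ("blood", "carries", "oxygen")]
student = [("heart", "pumps", "fluid")]
sigma = similarity_flooding(model, student, lambda m, s: 1.0 if m == s else 0.0)
gaps = find_gaps(sigma)
print(gaps)  # ['oxygen']
```

In this toy case the pair (blood, fluid) starts with zero similarity but receives flow from (heart, heart) through the shared `pumps` edge, so it is treated as aligned, while `oxygen` has no counterpart and is reported as the gap. Edge direction matters because pairs are only linked when the same label connects source to target in both graphs.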

Paper Structure

This paper contains 35 sections, 36 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: An example for extraction of a gap in student answer using System-I. The dashed lines indicate alignment of node-pairs from the pair of answer graphs. The elliptical portion indicates the gap detected by System-I in the student answer.
  • Figure 2: Workflow for directed graph-alignment approach towards FA
  • Figure 3: Canonical answer graphs for a $\langle$Model Answer, Student Answer$\rangle$ pair. The graphs obtained by removing the group IDs on the edges represent the original answer graphs, and those obtained by keeping only the group IDs represent the canonical answer graphs. The nodes are also assigned IDs for clarity in the subsequent illustrations.
  • Figure 4: Pairwise Connectivity Graph (PCG) from $G_M$ and $G_S$
  • Figure 5: Construction of Induced Propagation Graph from PCG. The disconnected nodes from the PCG are not shown in the figure for simplicity.
  • ...and 5 more figures

Theorems & Definitions (11)

  • Definition 1
  • Example 4.1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Example 7.1
  • Example 7.2
  • Example 8.1
  • ...and 1 more