Automated Answer Validation using Text Similarity

Balaji Ganesan; Arjun Ravikumar; Lakshay Piplani; Rini Bhaumik; Dhivya Padmanaban; Shwetha Narasimhamurthy; Chetan Adhikary; Subhash Deshapogu

TL;DR

The paper tackles automated answer validation in science QA by learning a text-similarity metric between student responses and correct answers derived from supporting text. It uses a Siamese neural network with a distance function $D(x,y)$ and losses such as $L = \max(d(a,p) - d(a,n) + \text{margin}, 0)$ to pull similar pairs together and separate dissimilar ones, and compares against SBERT-based cosine similarity and LLM-based baselines. On the SciQ dataset, the Siamese approach achieves $84.50\%$ accuracy, outperforming SBERT at $74.90\%$, with ablation analyses underscoring the value of leveraging full support text for comparison. The work also demonstrates a Streamlit-based deployment and discusses extensions to retrieval-augmented generation and LLM-driven validation for scalable educational assessment.
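The margin-based loss quoted above has the form commonly known as a triplet loss. The following is a minimal sketch of how it behaves on toy vectors, not the authors' implementation. It assumes that $a$, $p$ and $n$ stand for an anchor, a matching (positive) and a non-matching (negative) embedding, that $d$ is the Euclidean distance, and that the margin is 1.0; none of these choices is confirmed by this summary, and the function names are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length vectors (assumed choice of d)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """L = max(d(a, p) - d(a, n) + margin, 0); the margin value is an assumption."""
    d_ap = euclidean_distance(anchor, positive)
    d_an = euclidean_distance(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)


# Toy embeddings: the positive lies close to the anchor, the negative far away.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # d(a, p) = 1
negative = [3.0, 4.0]   # d(a, n) = 5

well_separated = triplet_loss(anchor, positive, negative)  # max(1 - 5 + 1, 0)
badly_separated = triplet_loss(anchor, negative, positive)  # max(5 - 1 + 1, 0)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate this condition contribute to training.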

Abstract

Automated answer validation can help improve learning outcomes by providing appropriate feedback to learners, and by making question answering systems and online learning solutions more widely available. Prior work in science question answering shows that information retrieval methods outperform neural methods, especially on the multiple-choice version of this problem. We implement Siamese neural network models and produce a generalised solution to this problem. We compare our supervised model with other text-similarity-based solutions.
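A similarity-based baseline of the kind the abstract compares against can be sketched as a cosine-similarity check between two embedding vectors. This is an illustrative sketch only: the vectors are hand-written stand-ins for sentence embeddings, and the function names and the 0.8 acceptance threshold are hypothetical, not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_answer_valid(answer_vec, reference_vec, threshold=0.8):
    """Accept the answer when its embedding is close enough to the reference.

    The threshold is a hypothetical value chosen for illustration.
    """
    return cosine_similarity(answer_vec, reference_vec) >= threshold


# Toy vectors standing in for sentence embeddings.
reference = [1.0, 0.0]
close_answer = [1.0, 0.1]
unrelated_answer = [0.0, 1.0]

accepted = is_answer_valid(close_answer, reference)
rejected = is_answer_valid(unrelated_answer, reference)
```

In practice the threshold would have to be tuned on held-out data, which is one respect in which such a baseline differs from a supervised model that learns the decision boundary.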

Paper Structure (18 sections, 2 equations, 4 figures, 2 tables)

Introduction
Related Work
Our approach
Methodology
Siamese Neural Networks
Learning in Siamese Networks
Experiments
Datasets
Benchmarks
Experimental setup
SBERT
Siamese Networks
Hyperparameter tuning
Results
Discussion
...and 3 more sections

Figures (4)

Figure 1: An example question, options, answer and support text from the SciQ dataset. Our task is to validate the user's answer using the support text.
Figure 2: Automated Answer Validation using Text Similarity
Figure 3: Siamese Networks for Text Similarity in Automated Answer Validation
Figure 4: Automated Answer Evaluation system deployed on Streamlit
