Explainable Automatic Grading with Neural Additive Models

Aubrey Condor; Zachary Pardos

TL;DR

The paper addresses the explainability gap in automatic short answer grading (ASAG) by adopting Neural Additive Models (NAMs), which express a prediction as a sum of univariate feature functions, $g(E[y]) = \sum_{i=1}^K f_i(x_i)$, so that each feature's contribution can be inspected on its own. It uses Knowledge Integration (KI) rubrics to engineer 62 features from the semantic similarity between rubric phrases and response n-grams, computed with sentence-BERT embeddings, so that NAMs and logistic regression (LR) can be trained on the same features. Compared against DeBERTaV3-base and LR, NAMs offer a transparent view of per-feature contributions and deliver competitive performance: they outperform LR on the KI data and approach DeBERTa, though DeBERTa remains the strongest overall. The findings suggest NAMs can give educators useful, explainable scoring insights while maintaining solid predictive power, with potential for expansion to additional domains and deeper usability studies.
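The additive form above can be made concrete with a small sketch. This is not the authors' implementation: the subnetworks are untrained, their sizes and weights are arbitrary, the input values are hypothetical similarity scores, and a binary logit link is assumed for simplicity. It only illustrates that each feature passes through its own univariate function and that the prediction is the sum of those outputs.

```python
import math
import random

random.seed(0)

def make_feature_net(hidden=8):
    """Random (untrained) weights for one univariate subnetwork f_i: R -> R."""
    return {
        "w1": [random.gauss(0, 1) for _ in range(hidden)],
        "b1": [random.gauss(0, 1) for _ in range(hidden)],
        "w2": [random.gauss(0, 1) for _ in range(hidden)],
    }

def feature_contribution(net, x_i):
    """Evaluate f_i(x_i): one ReLU hidden layer, scalar in, scalar out."""
    hidden = [max(0.0, w * x_i + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(h * w for h, w in zip(hidden, net["w2"]))

def nam_forward(nets, x):
    """Return the per-feature contributions and their sum, g(E[y])."""
    contribs = [feature_contribution(net, x_i) for net, x_i in zip(nets, x)]
    return contribs, sum(contribs)

K = 4  # the paper uses 62 features; 4 keeps the example readable
nets = [make_feature_net() for _ in range(K)]
x = [0.9, 0.1, 0.5, 0.3]  # hypothetical similarity scores, one per feature
contribs, logit = nam_forward(nets, x)
prob = 1.0 / (1.0 + math.exp(-logit))  # assumed logit link, binary case
```

Because no subnetwork sees any feature but its own, changing one input changes only that feature's contribution, which is what makes a per-feature plot of $f_i$ a faithful description of the model.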

Abstract

The use of automatic short answer grading (ASAG) models may help alleviate the time burden of grading while encouraging educators to frequently incorporate open-ended items in their curriculum. However, current state-of-the-art ASAG models are large neural networks (NNs), often described as "black boxes", that provide no explanation of which characteristics of an input are important for the produced output. This inexplicable nature can be frustrating to teachers and students when trying to interpret or learn from an automatically generated grade. To create a powerful yet intelligible ASAG model, we experiment with a type of model called a Neural Additive Model (NAM), which combines the performance of an NN with the explainability of an additive model. We use a Knowledge Integration (KI) framework from the learning sciences to guide feature engineering, creating inputs that reflect whether a student includes certain ideas in their response. We hypothesize that indicating the inclusion (or exclusion) of predefined ideas as features will be sufficient for the NAM to have good predictive power and interpretability, as this may guide a human scorer using a KI rubric. We compare the performance of the NAM with that of another explainable model, logistic regression, using the same features, and with that of a non-explainable neural model, DeBERTa, which does not require feature engineering.
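As a rough illustration of the feature engineering summarized above, the sketch below scores how closely any n-gram of a response matches a rubric phrase. Several parts are assumptions rather than the paper's method: `toy_embed` is a bag-of-words stand-in for a sentence-BERT embedding, taking the maximum over same-length n-grams is an assumed aggregation, and the rubric phrases and response are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token windows of a token list."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]

def toy_embed(tokens):
    """Stand-in for a sentence-BERT embedding: a bag-of-words count vector."""
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def phrase_feature(phrase, response):
    """One feature per rubric phrase: best similarity over response n-grams (assumed aggregation)."""
    p = phrase.lower().split()
    r = response.lower().split()
    candidates = ngrams(r, len(p)) or [r]  # fall back to the whole response if it is too short
    return max(cosine(toy_embed(p), toy_embed(c)) for c in candidates)

rubric_phrases = ["sound waves", "vibrations travel"]  # hypothetical rubric phrases
response = "The sound waves make the air vibrate"      # hypothetical student response
features = [phrase_feature(p, response) for p in rubric_phrases]
```

With this toy embedding, "vibrations travel" shares no tokens with the response and scores zero even though "vibrate" is related. A sentence embedding model is meant to capture that kind of semantic match, which is why the choice of embedding matters for these features.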

Paper Structure (14 sections, 1 equation, 5 figures, 2 tables)

Introduction
Related Work
Explainable AI and ASAG
Applications of Neural Additive Models
Background
The Data
Neural Additive Models
Logistic Regression
DeBERTa
Methods
Feature Engineering
Evaluation
Results
Discussion

Figures (5)

Figure 1: The Sound Waves item bundle
Figure 2: The KI Scoring Rubric
Figure 3: NAM Mean Feature Importance
Figure 4: NAM shape functions of the highest and lowest rating category for the top 8 most important phrases/words. The y-axes show the log odds of predicting a given rating category, and the x-axes represent the range of similarity scores. The pink shades represent the data density at varying similarity scores.
Figure 5: An Example of NAM shape functions with all rating categories
