Transfer Learning for Automated Feedback Generation on Small Datasets
Oscar Morris
TL;DR
The paper tackles Automated Feedback Generation for very small datasets with long input sequences by proposing a three-stage transfer learning pipeline built on a Longformer encoder-decoder. The model is pre-trained on large abstractive summarisation and peer-review corpora (arXiv and PeerRead) before being fine-tuned on a tiny 70-sample dataset, achieving strong ROUGE scores and qualitatively useful feedback without handcrafted features. The results show that pre-training substantially improves learning on scarce data, while also revealing limitations in the readability of the generated feedback and the challenges posed by very long texts. The work highlights the practical potential of AI-assisted feedback in education and emphasises the need for careful real-world deployment and data considerations.
Abstract
Feedback is a very important part of the learning process. However, it is challenging to make this feedback both timely and accurate when relying on human markers. This is the challenge that Automated Feedback Generation attempts to address. In this paper, a technique to train such a system on a very small dataset with very long sequences is presented. Both of these attributes make this a very challenging task. However, by using a three-stage transfer learning pipeline, state-of-the-art results can be achieved, with feedback that is qualitatively accurate but does not sound human-written. The use of both Automated Essay Scoring and Automated Feedback Generation systems in the real world is also discussed.
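To make the idea of a three-stage transfer learning pipeline concrete, the following is a minimal Python sketch of sequential training, in which each stage starts from the weights produced by the previous one. It is an illustration under stated assumptions, not the paper's implementation: the stage order is read off the summary above, and the names `Stage`, `run_pipeline` and `fake_train_stage` are hypothetical. The training step is a stand-in that only records which corpora the weights have passed through, where a real system would fine-tune a Longformer encoder-decoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str    # hypothetical label for the stage
    corpus: str  # corpus used in that stage


# Assumed order: summarisation first, peer review second, target data last.
PIPELINE: List[Stage] = [
    Stage("summarisation", "arXiv"),
    Stage("peer_review", "PeerRead"),
    Stage("feedback", "70-sample target dataset"),
]


def run_pipeline(
    stages: List[Stage],
    train_stage: Callable[[list, Stage], list],
    initial_weights: list,
) -> Tuple[list, List[str]]:
    """Train stage by stage, carrying the weights forward each time."""
    weights = initial_weights
    completed: List[str] = []
    for stage in stages:
        # Each stage is initialised from the previous stage's result.
        weights = train_stage(weights, stage)
        completed.append(stage.name)
    return weights, completed


def fake_train_stage(weights: list, stage: Stage) -> list:
    """Placeholder for real training: logs the corpus the weights have seen."""
    return weights + [stage.corpus]


final_weights, completed = run_pipeline(PIPELINE, fake_train_stage, [])
```

The point of the structure is that the small target dataset is only ever seen by a model that has already been adapted to long-document summarisation and to review-style text, which is what lets the final stage work with so few samples.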
