HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Christoforos Vasilatos, Manaar Alam, Talal Rahwan, Yasir Zaki, Michail Maniatakos
TL;DR
The paper tackles the challenge of detecting AI-generated university homework by introducing HowkGPT, a perplexity-based detector that uses metadata-driven, category-specific thresholds. It relies on a pretrained GPT-2 model to compute perplexities on a dataset of student and ChatGPT responses, annotated with knowledge and cognitive-process categories. ROC-AUC and F1 metrics guide the choice of thresholds, and experiments show improved accuracy when category-based thresholds are applied and when dataset flavors that filter noise are used. The work also provides an offline and online workflow and a public web application, contributing a practical framework for upholding academic integrity as LLM capabilities evolve.
Abstract
As the use of Large Language Models (LLMs) in text generation tasks proliferates, concerns arise over their potential to compromise academic integrity. The education sector currently struggles to distinguish student-authored homework assignments from AI-generated ones. This paper addresses the challenge by introducing HowkGPT, a tool designed to identify homework assignments generated by AI. HowkGPT is built upon a dataset of academic assignments and accompanying metadata [17] and employs a pretrained LLM to compute perplexity scores for student-authored and ChatGPT-generated responses. These scores are then used to establish a threshold for discerning the origin of a submitted assignment. Given the specificity and contextual nature of academic work, HowkGPT further refines its analysis by defining category-specific thresholds derived from the metadata, which improves detection precision. This study emphasizes the critical need for effective strategies to uphold academic integrity amidst the growing influence of LLMs and provides an approach to ensuring fair and accurate grading in educational institutions.
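The thresholding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the category labels ("factual", "conceptual"), and the toy scores are all hypothetical, and per-token log-probabilities are passed in directly rather than computed with a pretrained model such as GPT-2. The sketch also assumes that AI-generated text tends to receive lower perplexity than student text, so a response is flagged when its score falls at or below the threshold. Thresholds are chosen here by maximizing F1 over the observed scores, which is one plausible reading of the selection step.

```python
import math
from collections import defaultdict


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean token log-probability (natural log)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def f1_at_threshold(scores, labels, threshold):
    """F1 for the AI class (label 1).

    Assumption: a response is predicted AI-generated when score <= threshold.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_threshold(scores, labels):
    """Return the observed score that gives the highest F1 (lowest one on ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))


def category_thresholds(records):
    """Fit one threshold per category.

    records: iterable of (category, perplexity, label), label 1 = AI-generated.
    """
    grouped = defaultdict(lambda: ([], []))
    for category, score, label in records:
        grouped[category][0].append(score)
        grouped[category][1].append(label)
    return {c: best_threshold(s, y) for c, (s, y) in grouped.items()}


def classify(score, category, thresholds, fallback):
    """Label a response, using the fallback threshold for unseen categories."""
    limit = thresholds.get(category, fallback)
    return "ai" if score <= limit else "student"


# Hypothetical toy data: (category, perplexity, label).
records = [
    ("factual", 10.0, 1), ("factual", 12.0, 1),
    ("factual", 30.0, 0), ("factual", 35.0, 0),
    ("conceptual", 20.0, 1), ("conceptual", 25.0, 1),
    ("conceptual", 50.0, 0), ("conceptual", 60.0, 0),
]
thresholds = category_thresholds(records)
```

With these toy numbers, a response with perplexity 22 would be labeled differently depending on its category, which is the motivation for category-specific rather than global thresholds.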
