Topic Discovery and Classification for Responsible Generative AI Adaptation in Higher Education
Diane Myung-kyung Woodbridge, Allyson Seba, Freddie Seba, Aydin Schwartz
TL;DR
The paper tackles the lack of consistent GenAI guidance in higher education by building a scalable pipeline that collects policy texts, discovers latent policy topics using BERTopic with UMAP and clustering, and classifies policy statements into eight categories using LLMs (notably GPT-4.0). It reports a topic coherence score of 0.73, and GPT-4.0 classification precision of 0.92–0.97 and recall of 0.85–0.97 across the eight topics. It also demonstrates integration into a web app for policy-aware EdTech. The approach yields structured, interpretable policy insights that promote safe, equitable GenAI use and can guide platform-based enforcement. The authors also discuss limitations, resource needs, and plans for ongoing updates and K–12 extensions.
Abstract
As generative artificial intelligence (GenAI) becomes increasingly capable of delivering personalized learning experiences and real-time feedback, a growing number of students are incorporating these tools into their academic workflows. They use GenAI to clarify concepts, solve complex problems, and, in some cases, complete assignments by copying and pasting model-generated content. While GenAI has the potential to enhance the learning experience, it also raises concerns around misinformation, hallucinated outputs, and its potential to undermine critical thinking and problem-solving skills. In response, many universities, colleges, departments, and instructors have begun to develop and adopt policies to guide the responsible integration of GenAI into learning environments. However, these policies vary widely across institutions and contexts, and their evolving nature often leaves students uncertain about expectations and best practices. To address this challenge, the authors designed and implemented an automated system for discovering and categorizing AI-related policies found in course syllabi and institutional policy websites. The system combines unsupervised topic modeling techniques to identify key policy themes with large language models (LLMs) to classify the level of GenAI allowance and other requirements in policy texts. The developed application achieved a coherence score of 0.73 for topic discovery. In addition, GPT-4.0-based classification of policy categories achieved precision between 0.92 and 0.97, and recall between 0.85 and 0.97 across eight identified topics. By providing structured and interpretable policy information, this tool promotes the safe, equitable, and pedagogically aligned use of GenAI technologies in education. Furthermore, the system can be integrated into educational technology platforms to help students understand and comply with relevant guidelines.
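To make the reported per-topic precision and recall ranges concrete, the following is a minimal sketch of how such scores could be computed from classifier output. It assumes each policy statement receives exactly one predicted category, which the abstract does not confirm; the function name `per_category_precision_recall` and the category labels are hypothetical and are not the paper's code, data, or category names.

```python
from collections import Counter


def per_category_precision_recall(y_true, y_pred):
    """Return {category: (precision, recall)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this category
        else:
            fp[pred] += 1       # predicted category was wrong
            fn[truth] += 1      # true category was missed
    scores = {}
    for cat in sorted(set(y_true) | set(y_pred)):
        predicted = tp[cat] + fp[cat]   # statements assigned to cat
        actual = tp[cat] + fn[cat]      # statements truly in cat
        precision = tp[cat] / predicted if predicted else 0.0
        recall = tp[cat] / actual if actual else 0.0
        scores[cat] = (precision, recall)
    return scores


# Hypothetical labels for illustration only (not the paper's data).
y_true = ["allowed", "allowed", "prohibited", "citation_required", "prohibited"]
y_pred = ["allowed", "prohibited", "prohibited", "citation_required", "prohibited"]

scores = per_category_precision_recall(y_true, y_pred)
for category, (p, r) in scores.items():
    print(f"{category}: precision={p:.2f} recall={r:.2f}")
```

Reporting the minimum and maximum of these per-category values across all eight categories would give ranges of the kind quoted in the abstract.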
