Learning to Use AI for Learning: Teaching Responsible Use of AI Chatbot to K-12 Students Through an AI Literacy Module
Ruiwei Xiao, Xinying Hou, Ying-Jui Tseng, Hsuan Nieu, Guanze Liao, John Stamper, Kenneth R. Koedinger
TL;DR
This paper addresses the need to cultivate prompting literacy for responsible AI use among K-12 students by designing a web-based, LLM-driven module that provides scenario-based practice and elaborated feedback via an AI auto-grader. It reports two classroom studies showing high auto-grading accuracy across dimensions, improvements in prompting context embedding and learner confidence, and insightful student perceptions. An iterative assessment design in Study 2 demonstrates improved item difficulty, discrimination, and reliability, while highlighting ongoing challenges and the need for larger-scale validation. The work provides a scalable, data-informed approach to AI literacy instruction and offers concrete guidance for broader deployment and future research on assessment design in K-12 AI education.
Abstract
As Artificial Intelligence (AI) becomes increasingly integrated into daily life, there is a growing need to equip the next generation with the ability to apply, interact with, evaluate, and collaborate with AI systems responsibly. Prior research highlights the urgent demand from K-12 educators to teach students the ethical and effective use of AI for learning. To address this need, we designed a Large Language Model (LLM)-based module to teach prompting literacy. The module includes scenario-based deliberate practice activities with direct interaction with intelligent LLM agents, aiming to foster secondary school students' responsible engagement with AI chatbots. We conducted two iterations of classroom deployment in 11 authentic secondary education classrooms, and evaluated 1) the AI-based auto-grader's capability; 2) changes in students' prompting performance and in their confidence in using AI for learning; and 3) the quality of the learning and assessment materials. Results indicated that the AI-based auto-grader could grade student-written prompts with satisfactory quality. In addition, the instructional materials supported students in improving their prompting skills through practice and led to positive shifts in their perceptions of using AI for learning. Furthermore, data from Study 1 informed assessment revisions in Study 2. Analyses of item difficulty and discrimination in Study 2 showed that True/False and open-ended questions could measure prompting literacy more effectively than multiple-choice questions for our target learners. These promising outcomes highlight the potential for broader deployment and underscore the need for larger-scale studies to assess learning effectiveness and assessment design.
