Table of Contents
Fetching ...

Small Language Models for Curriculum-based Guidance

Konstantinos Katharakis, Sippo Rossi, Raghava Rao Mukkamala

TL;DR

This work addresses the need for curriculum-aligned, privacy-conscious AI tutoring by deploying a retrieval-augmented generation pipeline across eight open-source small language models and one closed LLM, benchmarked against GPT-4o on a university mathematics curriculum. Grounded in a 726-slide course corpus and using a Chroma vector store with OpenAI embeddings, the approach enforces pedagogy via a system prompt and context retrieval to minimize hallucinations and ensure alignment with course materials. Results show that several open-source SLMs can match GPT-4o on theory and assignment tasks under proper prompting and RAG grounding, with substantial reductions in hallucinations and the ability to run in real-time on on-premises hardware. The work highlights practical benefits in cost, privacy, and sustainability, suggesting that scalable, energy-efficient AI teaching assistants are feasible for university deployment, and outlines concrete paths for future refinement and broader evaluation.

Abstract

The adoption of generative AI and large language models (LLMs) in education is still emerging. In this study, we explore the development and evaluation of AI teaching assistants that provide curriculum-based guidance using a retrieval-augmented generation (RAG) pipeline applied to selected open-source small language models (SLMs). We benchmarked eight SLMs, including LLaMA 3.1, IBM Granite 3.3, and Gemma 3 (7-17B parameters), against GPT-4o. Our findings show that with proper prompting and targeted retrieval, SLMs can match LLMs in delivering accurate, pedagogically aligned responses. Importantly, SLMs offer significant sustainability benefits due to their lower computational and energy requirements, enabling real-time use on consumer-grade hardware without depending on cloud infrastructure. This makes them not only cost-effective and privacy-preserving but also environmentally responsible, positioning them as viable AI teaching assistants for educational institutions aiming to scale personalized learning in a sustainable and energy-efficient manner.

Small Language Models for Curriculum-based Guidance

TL;DR

This work addresses the need for curriculum-aligned, privacy-conscious AI tutoring by deploying a retrieval-augmented generation pipeline across eight open-source small language models and one closed LLM, benchmarked against GPT-4o on a university mathematics curriculum. Grounded in a 726-slide course corpus and using a Chroma vector store with OpenAI embeddings, the approach enforces pedagogy via a system prompt and context retrieval to minimize hallucinations and ensure alignment with course materials. Results show that several open-source SLMs can match GPT-4o on theory and assignment tasks under proper prompting and RAG grounding, with substantial reductions in hallucinations and the ability to run in real-time on on-premises hardware. The work highlights practical benefits in cost, privacy, and sustainability, suggesting that scalable, energy-efficient AI teaching assistants are feasible for university deployment, and outlines concrete paths for future refinement and broader evaluation.

Abstract

The adoption of generative AI and large language models (LLMs) in education is still emerging. In this study, we explore the development and evaluation of AI teaching assistants that provide curriculum-based guidance using a retrieval-augmented generation (RAG) pipeline applied to selected open-source small language models (SLMs). We benchmarked eight SLMs, including LLaMA 3.1, IBM Granite 3.3, and Gemma 3 (7-17B parameters), against GPT-4o. Our findings show that with proper prompting and targeted retrieval, SLMs can match LLMs in delivering accurate, pedagogically aligned responses. Importantly, SLMs offer significant sustainability benefits due to their lower computational and energy requirements, enabling real-time use on consumer-grade hardware without depending on cloud infrastructure. This makes them not only cost-effective and privacy-preserving but also environmentally responsible, positioning them as viable AI teaching assistants for educational institutions aiming to scale personalized learning in a sustainable and energy-efficient manner.

Paper Structure

This paper contains 9 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Final architecture
  • Figure 2: Average performance of models on theory questions based on running the model 10x
  • Figure 3: Average performance of models on course assignment/guidance questions based on running the model 10x