Enhancing Critical Thinking in Education by means of a Socratic Chatbot

Lucile Favero; Juan Antonio Pérez-Ortiz; Tanja Käser; Nuria Oliver

TL;DR

This paper presents an innovative educational chatbot designed to foster critical thinking through Socratic questioning, based on small LLMs that are able to run locally on off-the-shelf hardware.

Abstract

While large language models (LLMs) increasingly play a pivotal role in education by providing instantaneous, adaptive responses, their potential to promote critical thinking remains understudied. In this paper, we address this gap and present an innovative educational chatbot designed to foster critical thinking through Socratic questioning. Unlike traditional intelligent tutoring systems, including educational chatbots, which tend to offer direct answers, the proposed Socratic tutor encourages students to explore various perspectives and engage in self-reflection by posing structured, thought-provoking questions. Our Socratic questioning is implemented by fine-tuning and prompt-tuning an open-source pretrained LLM on a specialized dataset that stimulates critical thinking and offers multiple viewpoints. To democratize access and protect students' privacy, the proposed tutor is based on small LLMs (Llama2 models with 7B and 13B parameters) that can run locally on off-the-shelf hardware. We validate our approach in a battery of experiments consisting of interactions between a simulated student and the chatbot to evaluate its effectiveness in enhancing critical thinking skills. Results indicate that the Socratic tutor supports the development of reflection and critical thinking significantly better than standard chatbots. Our approach opens the door to improving educational outcomes by cultivating active learning and encouraging intellectual autonomy.
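
The simulated student and tutor interaction described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the prompt text, the `run_conversation` helper, and the two stub functions are hypothetical and stand in for real LLM calls. The default of 5 turns follows the figure captions.

```python
from typing import Callable, List, Tuple

# Hypothetical system prompt (assumption); the paper's actual prompt is not shown here.
SOCRATIC_PROMPT = (
    "You are a Socratic tutor. Do not give direct answers. "
    "Reply with one thought-provoking question that helps the student "
    "examine assumptions and consider other perspectives."
)

Turn = Tuple[str, str]  # (speaker, text)
TutorFn = Callable[[str, List[Turn]], str]
StudentFn = Callable[[List[Turn]], str]


def run_conversation(question: str, student: StudentFn, tutor: TutorFn,
                     n_turns: int = 5) -> List[Turn]:
    """Alternate tutor and student replies for n_turns exchanges."""
    history: List[Turn] = [("student", question)]
    for _ in range(n_turns):
        history.append(("tutor", tutor(SOCRATIC_PROMPT, history)))
        history.append(("student", student(history)))
    return history


# Stub models (assumptions) so the sketch runs without any LLM.
def stub_tutor(system_prompt: str, history: List[Turn]) -> str:
    last_student_text = history[-1][1]
    return f"What makes you confident that '{last_student_text}'?"


def stub_student(history: List[Turn]) -> str:
    return "I had not considered that."


conversation = run_conversation("Is all knowledge based on evidence?",
                                stub_student, stub_tutor)
```

In a real setup, `stub_tutor` and `stub_student` would be replaced by calls to a locally hosted model; the loop itself stays the same.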

Paper Structure (23 sections, 6 figures, 12 tables)

Introduction
Related work
Educational chatbots
Critical thinking
Definition in education.
Chatbots for critical thinking.
Socratic questioning
Definition in education.
Chatbots for Socratic questioning.
Contributions
Socratic tutor implementation
Socratic fine-tuning
Hyperparameters.
Fine-tuning dataset.
Socratic prompt-tuning
...and 8 more sections

Figures (6)

Figure 1: METEOR scores of the different types of tutors, averaged over 20 conversations of 5 turns each for each of the 5 ToK questions. The differences between the Socratic tutors and the non-Socratic tutors are statistically significant (t-test, p < 0.001). No statistically significant difference is observed between the two Socratic tutors.
Figure 2: BERT scores of the different types of tutors, averaged over 20 conversations of 5 turns each for each of the 5 ToK questions. The differences between the Socratic tutors and the non-Socratic tutors are statistically significant (t-test, p < 0.001). No statistically significant difference is observed between the two Socratic tutors.
Figure 3: Critical thinking scores obtained by means of the LLM-score for the different types of tutors, averaged over 20 conversations of 5 turns each for each of the 5 ToK questions. The differences between the Socratic tutors and the non-Socratic tutors are statistically significant (t-test, p < 0.001). No statistically significant difference is observed between the two Socratic tutors.
Figure 4: BLEU scores of the different types of tutors, averaged over 20 conversations of 5 turns each for each of the 5 ToK questions. The differences between the Socratic tutors and the non-Socratic tutors are statistically significant (t-test, p < 0.001). No statistically significant difference is observed between the two Socratic tutors.
Figure 5: ROUGE-L scores of the different types of tutors, averaged over 20 conversations of 5 turns each for each of the 5 ToK questions. The differences between the Socratic tutors and the non-Socratic tutors are statistically significant (t-test, p < 0.001). No statistically significant difference is observed between the two Socratic tutors.
...and 1 more figure
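
For readers unfamiliar with the overlap metrics named in the captions, the following is a minimal sketch of ROUGE-L as an F1 score over the longest common subsequence (LCS) of whitespace tokens. The tokenization, the use of F1 rather than a weighted F-measure, and the function names are assumptions for illustration; the captions do not specify the exact variant or which reference text each tutor reply was compared against.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 between a candidate and a reference string."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical tutor reply and reference question, for illustration only.
score = rouge_l_f1("what do you mean by truth", "what do you mean")
```

Here the LCS covers all 4 reference tokens, so recall is 1 and precision is 4/6, giving an F1 of 0.8.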
