TrueReason: An Exemplar Personalised Learning System Integrating Reasoning with Foundational Models

Sahan Bulathwela, Daniel Van Niekerk, Jarrod Shipton, Maria Perez-Ortiz, Benjamin Rosman, John Shawe-Taylor

TL;DR

TrueReason is a personalised learning system inspired by the society-of-minds idea: it combines a collection of specialised AI micro-skills with a reasoning-enabled LLM to perform planning, reasoning, and adaptive instruction in education. Its central components are a Wikification-based domain model, an open learner model built on TrueLearn, and a situation model that manages dialogue and engagement; together they enable dynamic knowledge reviews and resource recommendations. The work introduces two micro-skills. The first is an RL-driven multi-step educational recommender, trained with Deep Deterministic Policy Gradient in a proxy environment built from two TrueLearn instances. The second is a topic-controlled educational question generator supported by novel datasets (SQuAD+, MixSQuAD, MixKhanQ). Initial results show promising trajectory optimisation and topic-aligned questioning, though challenges remain in ensuring that generated content is fully answerable and of high quality. Collectively, TrueReason demonstrates a scalable, modular framework for integrating planning with foundational models to tackle complex educational tasks, a step towards society-first AI, cross-modal lifelong learning, and richer, more autonomous, pedagogically guided learning companions.

Abstract

Personalised education is one of the domains that can greatly benefit from the most recent advances in Artificial Intelligence (AI) and Large Language Models (LLMs). However, it is also one of the most challenging applications due to the cognitive complexity of teaching effectively while personalising the learning experience to suit independent learners. We hypothesise that one promising approach to excelling in such demanding use cases is using a society of minds. In this chapter, we present TrueReason, an exemplar personalised learning system that integrates a multitude of specialised AI models, each mimicking a micro-skill, which an LLM composes to operationalise planning and reasoning. We present the architecture of the initial prototype and describe two micro-skills that have been incorporated in it. The proposed system demonstrates a first step in building sophisticated AI systems that can take up the very complex cognitive tasks demanded by domains such as education.

Paper Structure

This paper contains 46 sections, 5 equations, 9 figures, and 2 tables.

Figures (9)

  • Figure 1: The architecture of a personalised learning system that interacts with the learner using a user interface.
  • Figure 2: The technical architecture of the TrueReason system, which coordinates the interaction between an intelligent assistant that orchestrates the core conversation with the learner and an API coordinator that acts as the interface between the TrueReason core and the micro-skills.
  • Figure 3: The logical components of the TrueReason learning assistant. The blue parts are learning resources; the orange components belong to the core AI assistant, which comprises 1) the domain model (mapped to Wikipedia topics), 2) the learner model, and 3) the pedagogical model, which includes the engagement model and the situation model that interacts with the learner; the green part represents the user interface. Components (e.g. micro-skills) that are not implemented at present are drawn with dashed lines.
  • Figure 4: Given a goal topic, the KC graph over all the content is pruned to a smaller number of related topics, which are subsequently labelled by GPT-4 according to the set of relations (shown in red, green, and blue), including prerequisites to and applications of the focal topic (shown in navy blue).
  • Figure 5: The architecture of the training environment, in which two instances of TrueLearn act as the human proxy. After the RL algorithm recommends an educational resource, the first instance simulates engagement and updates the ground truth knowledge state (GTKS), and the second instance updates the approximated knowledge state (AKS).
  • ...and 4 more figures