LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning

Chang Yao; Jinghui Qin; Kebing Jin; Hankz Hankui Zhuo

TL;DR

A novel LLM-driven closed-loop framework is introduced, which enables semantic-driven skill reuse and real-time constraint monitoring by mapping natural language instructions into executable rules and semantically annotating automatically created options.

Abstract

Despite remarkable success in complex tasks, Deep Reinforcement Learning (DRL) still suffers from critical issues in practical applications, such as low data efficiency, lack of interpretability, and limited cross-environment transferability. Moreover, a learned policy that generates actions from states is sensitive to environmental changes and struggles to guarantee behavioral safety and compliance. Recent research shows that integrating Large Language Models (LLMs) with symbolic planning is a promising way to address these challenges. Inspired by this, we introduce a novel LLM-driven closed-loop framework that enables semantic-driven skill reuse and real-time constraint monitoring by mapping natural language instructions into executable rules and semantically annotating automatically created options. The proposed approach uses the general knowledge of LLMs to improve exploration efficiency and to adapt transferable options to similar environments, and it provides inherent interpretability through semantic annotations. To validate the framework, we conduct experiments on two domains, Office World and Montezuma's Revenge. The results demonstrate superior performance in data efficiency, constraint compliance, and cross-task transferability.

Paper Structure (25 sections, 8 equations, 5 figures, 1 algorithm)

Introduction
Problem Definition
Preliminaries
Symbolic Planning with PDDL
Reinforcement Learning
Option Framework
Large Language Models
The LLM-SOARL Framework
Planning-Meta-Control Module
Semantic Skill Generation Module
Semantic Labels Generation
Skill Library Expansion
Skill Reusage
Constraint Adaptation Module
LLM and External Restrictions
...and 10 more sections

Figures (5)

Figure 1: Exploration of Natural Language Limitations Graph
Figure 2: The LLM-SOARL framework
Figure 3: Generate semantic labels using LLMs
Figure 4: The Office World Result
Figure 5: The Montezuma's Revenge Result
