Mobile Robots through Task-Based Human Instructions using Incremental Curriculum Learning

Muhammad A. Muttaqien, Ayanori Yorozu, Akihisa Ohya

TL;DR

The paper addresses how to enable mobile robots to follow task-based human instructions in indoor environments by integrating incremental curriculum learning with deep reinforcement learning. It proposes a Multimodal Deep Q Network that fuses RGB observations with textual goals within the AI2-THOR simulator, and designs a staged curriculum by decomposing instructions and embedding words with GloVe. Key findings show that incremental curriculum learning improves task accomplishment and generalization. A sensitivity analysis guides hyperparameter choices and indicates that reward shaping may be less critical when a structured curriculum is used. The work advances instruction-driven robot navigation and points to future improvements in attention-based text processing and broader curricula for unseen instructions.

Abstract

This paper explores the integration of incremental curriculum learning (ICL) with deep reinforcement learning (DRL) techniques to facilitate mobile robot navigation through task-based human instruction. By adopting a curriculum that mirrors the progressive complexity encountered in human learning, our approach systematically enhances robots' ability to interpret and execute complex instructions over time. We explore the principles of DRL and its synergy with ICL, demonstrating how this combination not only improves training efficiency but also equips mobile robots with the generalization capability required for navigating through dynamic indoor environments. Empirical results indicate that robots trained with our ICL-enhanced DRL framework outperform those trained without curriculum learning, highlighting the benefits of structured learning progressions in robotic training.
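
As a rough illustration of what a staged curriculum can look like in a training loop, the following minimal sketch advances through an ordered list of instruction stages once the recent success rate is high enough. It is not the authors' implementation: the class name `IncrementalCurriculum`, the threshold-and-window promotion rule, and the four example instructions are all assumptions made for this sketch.

```python
from collections import deque


class IncrementalCurriculum:
    """Hypothetical scheduler: step through ordered stages of rising difficulty."""

    def __init__(self, stages, threshold=0.75, window=4):
        self.stages = list(stages)          # ordered from easiest to hardest
        self.threshold = threshold          # assumed promotion criterion
        self.window = window                # number of recent episodes considered
        self.index = 0
        self.recent = deque(maxlen=window)  # sliding window of episode outcomes

    @property
    def current(self):
        return self.stages[self.index]

    def record(self, success):
        """Log one episode outcome and promote when the full window meets the threshold."""
        self.recent.append(bool(success))
        window_full = len(self.recent) == self.window
        rate = sum(self.recent) / len(self.recent)
        has_next = self.index < len(self.stages) - 1
        if window_full and rate >= self.threshold and has_next:
            self.index += 1
            self.recent.clear()             # start the next stage with a fresh window
        return self.current


# Hypothetical four-stage decomposition of one instruction (not from the paper).
stages = [
    "fridge",
    "go to fridge",
    "go to the fridge",
    "go to the fridge and open it",
]
curriculum = IncrementalCurriculum(stages, threshold=0.75, window=4)
for outcome in [True, False, True, True]:   # stand-in for real episode results
    curriculum.record(outcome)
```

In an actual DRL setup, `record` would be called once per training episode with the task outcome, and `current` would supply the textual goal given to the agent for the next episode.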

Paper Structure

This paper contains 16 sections, 2 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Two examples of the AI2-THOR environment simulator, which is available for learning navigation based on human instructions. Our robot model aims to navigate efficiently and accomplish tasks in a minimal number of steps.
  • Figure 2: Bird's-eye view of a kitchen with a variety of objects in the AI2-THOR environment. The floor can also be identified during navigation.
  • Figure 3: Snapshot of the mobile robot simulated within the AI2-THOR framework.
  • Figure 4: Our robot model observes RGB visual image data. Within a single observation, multiple objects in the kitchen room can be identified by the agent.
  • Figure 5: Training results from each stage indicate that our model successfully learns to execute the instructions after completing four stages of distinct training processes.
  • ...and 5 more figures