Large language model-based task planning for service robots: A review

Shaohan Bian; Ying Zhang; Guohui Tian; Zhiqiang Miao; Edmond Q. Wu; Simon X. Yang; Changchun Hua

TL;DR

This paper surveys the integration of large language models (LLMs) into service-robot task planning, addressing the challenges of planning in unstructured domestic environments. It develops a modality-centric taxonomy spanning text-based, vision-language, audio-based, and multimodal planning, and reviews foundational LLM techniques (pre-training, fine-tuning, retrieval-augmented generation, and prompting) alongside their robotic applications. Key contributions include structuring literature around input modalities, analyzing core hurdles such as perception gaps, real-time constraints, and multimodal fusion, and proposing directions like benchmarks and embodied intelligence to advance practical deployment. The work provides a consolidated reference for researchers and practitioners seeking to enhance autonomy, safety, and adaptability of service robots through LLM-driven task planning.

Abstract

With the rapid advancement of large language models (LLMs) and robotics, service robots are increasingly becoming an integral part of daily life, offering a wide range of services in complex environments. To deliver these services intelligently and efficiently, robust and accurate task planning capabilities are essential. This paper presents a comprehensive overview of the integration of LLMs into service robotics, with a particular focus on their role in enhancing robotic task planning. First, the development and foundational techniques of LLMs, including pre-training, fine-tuning, retrieval-augmented generation (RAG), and prompt engineering, are reviewed. We then explore the application of LLMs as the cognitive core, or "brain," of service robots, discussing how LLMs contribute to improved autonomy and decision-making. Furthermore, recent advancements in LLM-driven task planning across various input modalities are analyzed, including text, visual, audio, and multimodal inputs. Finally, we summarize key challenges and limitations in current research and propose future directions to advance the task planning capabilities of service robots in complex, unstructured domestic environments. This review aims to serve as a valuable reference for researchers and practitioners in the fields of artificial intelligence and robotics.
