AdaptBot: Combining LLM with Knowledge Graphs and Human Input for Generic-to-Specific Task Decomposition and Knowledge Refinement
Shivam Singh, Karthik Swaminathan, Nabanita Dash, Ramandeep Singh, Snehasis Banerjee, Mohan Sridharan, Madhava Krishna
TL;DR
This work tackles the challenge of performing unseen tasks with limited labeled data by combining LLM-driven generic task decomposition with a domain-specific Knowledge Graph (KG) and human-in-the-loop (HITL) refinement. The framework uses two RDF-based graphs, G_s (state) and G_k (attributes), to check the feasibility of LLM-predicted sub-tasks via SPARQL queries and to refine the LLM output when mismatches occur; HITL updates then expand the KG. Experimental results on simulated cooking and cleaning tasks show that merging LLM predictions with KG knowledge and selective human feedback yields substantial performance gains over using LLMs alone or LLM+KG, and supports incremental adaptation to new task classes without heavy tuning. By leveraging the complementary strengths of LLMs, structured domain knowledge, and user input, this approach enables faster, more reliable deployment of embodied agents in open-set domains, with potential extensions to real robots and broader domains.
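The feasibility check summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python sets of triples stand in for the RDF graphs G_s and G_k, a simple pattern match stands in for a SPARQL ASK query, and every predicate name (`locatedIn`, `canPerform`, `supportsAction`), function name, and example fact is a hypothetical chosen for illustration. Storing human-confirmed facts in G_k is likewise an assumption.

```python
from typing import Callable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def ask(graph: Set[Triple], pattern: Pattern) -> bool:
    """Rough analogue of a one-pattern SPARQL ASK; None acts as a wildcard."""
    return any(
        all(want is None or want == got for want, got in zip(pattern, triple))
        for triple in graph
    )


def missing_facts(g_s: Set[Triple], g_k: Set[Triple],
                  action: str, target: str, tool: str) -> List[Pattern]:
    """Return the patterns an LLM-predicted sub-task needs but the graphs lack."""
    # Hypothetical requirements: both objects exist in the state graph, and
    # the attribute graph says the tool and target are compatible with the action.
    state_checks = [(tool, "locatedIn", None), (target, "locatedIn", None)]
    attr_checks = [(tool, "canPerform", action), (target, "supportsAction", action)]
    missing = [p for p in state_checks if not ask(g_s, p)]
    missing += [p for p in attr_checks if not ask(g_k, p)]
    return missing


def refine_with_human(g_k: Set[Triple], missing: List[Pattern],
                      confirm: Callable[[Triple], bool]) -> None:
    """Add fully specified missing facts to g_k when the human confirms them."""
    for pattern in missing:
        if None in pattern:
            continue  # wildcard (state) patterns are not asked about in this sketch
        if confirm(pattern):
            g_k.add(pattern)


# Illustrative data only; not taken from the paper's knowledge graph.
g_s = {("knife", "locatedIn", "kitchen"), ("tomato", "locatedIn", "kitchen")}
g_k = {("knife", "canPerform", "cut")}

before = missing_facts(g_s, g_k, "cut", "tomato", "knife")
refine_with_human(g_k, before, confirm=lambda fact: True)  # human says "yes"
after = missing_facts(g_s, g_k, "cut", "tomato", "knife")
```

Here `before` holds the single unmet attribute requirement for the tomato, and `after` is empty once the confirmed fact has been added. A real system would issue SPARQL queries against an RDF store rather than scanning Python sets.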
Abstract
An embodied agent assisting humans is often asked to complete new tasks, and there may not be sufficient time or labeled examples to train the agent to perform these new tasks. Large Language Models (LLMs) trained on considerable knowledge across many domains can be used to predict a sequence of abstract actions for completing such tasks, although the agent may not be able to execute this sequence due to task-, agent-, or domain-specific constraints. Our framework addresses these challenges by leveraging the generic predictions provided by an LLM and the prior domain knowledge encoded in a Knowledge Graph (KG), enabling an agent to quickly adapt to new tasks. The robot also solicits and uses human input as needed to refine its existing knowledge. Based on experimental evaluation in the context of cooking and cleaning tasks in simulation domains, we demonstrate that the interplay between the LLM, the KG, and human input leads to substantial performance gains compared with just using the LLM. Project website: https://sssshivvvv.github.io/adaptbot/
