Lang2Manip: A Tool for LLM-Based Symbolic-to-Geometric Planning for Manipulation

Muhayy Ud Din, Jan Rosell, Waseem Akram, Irfan Hussain

TL;DR

The paper addresses the challenge of generalizing language-conditioned manipulation across diverse robots and planning backends. It proposes Lang2Manip, a modular pipeline that connects an LLM-driven symbolic planner with the Kautham motion-planning framework to enable robot-agnostic symbolic-to-geometric execution. Key contributions include a two-layer architecture, robot-scene integration via URDF/XML, an LLM prompting scheme with a fixed action grammar, and a grounding pipeline through grasp planning, IK, and OMPL-based planning. Experimental results with a Franka Panda in simulation demonstrate competitive task success and planning feasibility, supporting the approach's scalability and versatility for language-driven task and motion planning.

Abstract

Simulation is essential for developing robotic manipulation systems, particularly for task and motion planning (TAMP), where symbolic reasoning interfaces with geometric, kinematic, and physics-based execution. Recent advances in Large Language Models (LLMs) enable robots to generate symbolic plans from natural language, yet executing these plans in simulation often requires robot-specific engineering or planner-dependent integration. In this work, we present a unified pipeline that connects an LLM-based symbolic planner with the Kautham motion planning framework to achieve generalizable, robot-agnostic symbolic-to-geometric manipulation. Kautham provides ROS-compatible support for a wide range of industrial manipulators and offers geometric, kinodynamic, physics-driven, and constraint-based motion planning under a single interface. Our system converts language instructions into symbolic actions, then computes and executes collision-free trajectories using any of Kautham's planners without additional coding. The result is a flexible and scalable tool for language-driven TAMP that generalizes across robots, planning modalities, and manipulation tasks.
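To make the symbolic-to-geometric step concrete, here is a minimal Python sketch of how a single symbolic action might be grounded through grasp planning, inverse kinematics, and motion planning. It is not the authors' implementation and does not use Kautham's API: the `Grounder` class, its three callables, and the idea of looping over grasp candidates until one is both reachable and plannable are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical types: a grasp pose and a joint configuration as plain tuples.
Pose = tuple[float, ...]
JointConfig = tuple[float, ...]
Path = list[JointConfig]


@dataclass
class Grounder:
    """Grounds one symbolic action using three pluggable stages (all assumed)."""

    grasp_planner: Callable[[str], Sequence[Pose]]
    ik_solver: Callable[[Pose], Optional[JointConfig]]
    motion_planner: Callable[[JointConfig, JointConfig], Optional[Path]]

    def ground(self, obj: str, start: JointConfig) -> Optional[Path]:
        """Return the first collision-free path to a reachable grasp, else None."""
        for grasp in self.grasp_planner(obj):
            goal = self.ik_solver(grasp)
            if goal is None:
                continue  # no IK solution: this grasp is unreachable
            path = self.motion_planner(start, goal)
            if path is not None:
                return path  # first feasible grasp wins
        return None  # every candidate failed at IK or at planning
```

Passing the stages in as callables keeps the loop independent of any particular robot or planner, which mirrors the robot-agnostic goal stated in the abstract; in a real system each callable would wrap the corresponding plugin or planner call.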

Paper Structure

This paper contains 24 sections, 2 equations, 6 figures.

Figures (6)

  • Figure 1: Overview of the proposed LLM–Kautham manipulation planning framework. The LLM generates symbolic actions from task, system, and scene descriptions, which are executed via grasp and IK plugins. These actions are passed to the Kautham Project, which handles problem setup and motion planning through its GUI, state textualization module, and multiple planners (physics-based, kinodynamic, geometric, and knowledge-oriented).
  • Figure 2: Example Kautham problem file defining a manipulation scene with the Franka Emika Panda robot. The XML specification includes the robot and obstacle URDF models, their poses in the workspace, the associated control file, planner selection (RRT in this example), planner parameters, and a query block specifying the initial and goal values for each controlled joint.
  • Figure 3: Directory structure used in Kautham for organizing planning scenes. The Kautham_Demos folder contains a Kautham Models Directory with URDF files for robots and obstacles, and a Kautham Problem Directory with problem descriptions and control files required to instantiate a planning scenario.
  • Figure 4: Kautham visualization of a manipulation scene. The left panel shows the standard Kautham viewer displaying the robot and workspace objects, while the right panel illustrates the robot's collision model together with the computed motion plan (red trajectory) and the corresponding exploration tree generated by the RRT planner (green samples).
  • Figure 5: LLM-guided symbolic planning pipeline. The prompt is composed of three components: the user-defined task, the fixed system prompt describing the required output format and action schema, and the textualized environment state obtained from Kautham. The combined prompt is passed to the LLM, which produces a structured JSON plan containing high-level symbolic actions.
  • ...and 1 more figure
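
The prompt composition described for Figure 5 (user task, fixed system prompt, textualized scene state, JSON plan out) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's actual prompt or grammar: the action names in `ACTION_SCHEMA`, the `{"plan": [...]}` JSON shape, and the functions `build_prompt` and `parse_plan` are all hypothetical.

```python
import json

# Hypothetical action schema: action name -> required argument names.
ACTION_SCHEMA = {
    "pick": ("object",),
    "place": ("object", "target"),
}

# Hypothetical fixed system prompt describing the output format.
SYSTEM_PROMPT = (
    "Reply with JSON only, shaped as "
    '{"plan": [{"action": <name>, "args": {...}}]}. '
    "Allowed actions: " + ", ".join(sorted(ACTION_SCHEMA)) + "."
)


def build_prompt(task: str, scene_state: str,
                 system_prompt: str = SYSTEM_PROMPT) -> str:
    """Join the three prompt components into one string."""
    return "\n\n".join([
        "SYSTEM:\n" + system_prompt,
        "SCENE:\n" + scene_state,
        "TASK:\n" + task,
    ])


def parse_plan(reply: str) -> list[dict]:
    """Parse the LLM reply and reject steps outside the fixed schema."""
    steps = json.loads(reply)["plan"]
    for i, step in enumerate(steps):
        name = step.get("action")
        if name not in ACTION_SCHEMA:
            raise ValueError(f"step {i}: unknown action {name!r}")
        args = step.get("args", {})
        missing = [a for a in ACTION_SCHEMA[name] if a not in args]
        if missing:
            raise ValueError(f"step {i}: {name} is missing {missing}")
    return steps
```

Validating the reply before any geometric planning starts means a malformed or out-of-grammar step fails fast, rather than surfacing later as an IK or motion-planning failure.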