Table of Contents
Fetching ...

Task and Motion Planning in Hierarchical 3D Scene Graphs

Aaron Ray, Christopher Bradley, Luca Carlone, Nicholas Roy

TL;DR

This work describes a method for building sparse problem instances which enables scaling planning to large scenes, and proposes a technique for incrementally adding objects to that domain during planning time that minimizes computation on irrelevant elements of the scene graph.

Abstract

Recent work in the construction of 3D scene graphs has enabled mobile robots to build large-scale metric-semantic hierarchical representations of the world. These detailed models contain information that is useful for planning, however an open question is how to derive a planning domain from a 3D scene graph that enables efficient computation of executable plans. In this work, we present a novel approach for defining and solving Task and Motion Planning problems in large-scale environments using hierarchical 3D scene graphs. We describe a method for building sparse problem instances which enables scaling planning to large scenes, and we propose a technique for incrementally adding objects to that domain during planning time that minimizes computation on irrelevant elements of the scene graph. We evaluate our approach in two real scene graphs built from perception, including one constructed from the KITTI dataset. Furthermore, we demonstrate our approach in the real world, building our representation, planning in it, and executing those plans on a real robotic mobile manipulator. A video supplement is available at \url{https://youtu.be/v8fkwLjBn58}.

Task and Motion Planning in Hierarchical 3D Scene Graphs

TL;DR

This work describes a method for building sparse problem instances which enables scaling planning to large scenes, and proposes a technique for incrementally adding objects to that domain during planning time that minimizes computation on irrelevant elements of the scene graph.

Abstract

Recent work in the construction of 3D scene graphs has enabled mobile robots to build large-scale metric-semantic hierarchical representations of the world. These detailed models contain information that is useful for planning, however an open question is how to derive a planning domain from a 3D scene graph that enables efficient computation of executable plans. In this work, we present a novel approach for defining and solving Task and Motion Planning problems in large-scale environments using hierarchical 3D scene graphs. We describe a method for building sparse problem instances which enables scaling planning to large scenes, and we propose a technique for incrementally adding objects to that domain during planning time that minimizes computation on irrelevant elements of the scene graph. We evaluate our approach in two real scene graphs built from perception, including one constructed from the KITTI dataset. Furthermore, we demonstrate our approach in the real world, building our representation, planning in it, and executing those plans on a real robotic mobile manipulator. A video supplement is available at \url{https://youtu.be/v8fkwLjBn58}.
Paper Structure (13 sections, 3 theorems, 1 equation, 5 figures, 1 algorithm)

This paper contains 13 sections, 3 theorems, 1 equation, 5 figures, 1 algorithm.

Key Result

Proposition 1

Consider a feasible planning instance $R = (\mathcal{P}, \mathcal{A}, \mathcal{S}, \mathcal{O}, \mathcal{I}_0, \mathcal{G})$. For a redundant symbol $x \in \mathcal{O}$, we define a related instance $R' = (\mathcal{P}, \mathcal{A}, \mathcal{S}, \mathcal{O}', \mathcal{I}_0', \mathcal{G}')$ where $x$

Figures (5)

  • Figure 1: An illustration of how we derive and encode tasks in our planning representation from a 3D scene graph. (A) An isometric view of a Hydra scene graph generated from the KITTI dataset, giving an insight to the scale of the environment. (B) A simplified version of this scene, where the agent is tasked with either visiting Place 6 while avoiding Place 1, or visiting Place 5. We see that Place 5 is partially obstructed by a suspicious object, so the agent must consider either avoiding it (green trajectory), or inspecting and neutralizing the object (blue trajectory) to reach its goal. (C) A mobile robot (which we used to build our scene graphs) executing a plan in the real world, inspecting an object in the top frame, and moving an obstruction out of its path on the bottom.
  • Figure 2: Our tri-level planner in a real world environment. A scene graph of the fifth floor of building 45 at MIT was built using the Spot robot, from which we extract our planning abstraction for the goal: (and (ObjectAtPlace O27 P909) (VisitedPlace P2700) (Safe O35) (not (VisitedPlace P1153))), which instructs the agent to move O27 to a P909, inspect O35, visit P2700, all while avoiding P1153. At the highest level, the task planner is given a very sparsified version of the scene, as highlighted above. The mid-level planner plans a path through the places guided by the abstract plan found at the highest level, avoiding place P1153. Feedback from this level leads to the addition of O28 to the high-level domain, as O27 would be otherwise unreachable. The low-level planner computes full trajectories, guided by the path found at the mid-level. The plan produced (and executed by the robot) is shown on the right.
  • Figure 3: Three of the maps used for evaluation (not shown is the KITTI or building 45 environments). A narrow alley map, a simple 10x10 grid world map, and a scene graph built from real data collected by a robot in an office environment.
  • Figure 4: A) Comparison of the time to solve tasks of comparable complexity across different environmental scales for the dense formulation and the proposed sparse formulation. B) The scaling of our approach with the complexity of the goal specification in the simple 10x10 grid world. As we increase the number of unique objects in the goal specification, the problem is no longer sparse, and so it no longer benefits from our approach.
  • Figure 5: An example plan from the KITTI environment. The robot begins in the top left, and is tasked with inspecting one object (denoted by the red triangle at the end of the trajectory). Along the way, there are numerous objects potentially blocking the path, so we must add at least one to its planning domain. After inspecting and neutralizing this object, the robot can reach its goal.

Theorems & Definitions (7)

  • Definition 1: Redundant Symbol
  • Proposition 1: Removing Redundant Symbol Preserves Feasibility
  • Proposition 2: Redundant Places
  • remark 1: Problem Initialization
  • Proposition 3: Sufficient Conditions for Ignoring Places
  • Definition 2: V-Extended State Plan
  • Definition 3: Execution consistent