Long-Term Planning Around Humans in Domestic Environments with 3D Scene Graphs

Ermanno Bartoli, Dennis Rotondi, Kai O. Arras, Iolanda Leite

TL;DR

The paper addresses long-term planning for robots in domestic environments by explicitly modeling human activities and their spatial influence in an enriched 3D scene graph (3DSG) that is extended with human nodes and activity-based relations. It proposes a pipeline that extracts a partial 3D semantic scene graph (Partial 3DSSG) around a planned trajectory, enriches it with human activity context, and uses a Large Language Model (LLM) to assign each object a cost and a clearance that modulate trajectory planning. Preliminary findings in a sample scene show that activity-aware relational context reduces inappropriate cost allocation to unoccupied objects and yields more context-sensitive navigation. Future work aims to integrate the cost signals into a full planning pipeline and validate trajectory acceptability through user studies.

Abstract

Long-term planning for robots operating in domestic environments poses unique challenges due to the interactions between humans, objects, and spaces. Recent advances in trajectory planning have leveraged vision-language models (VLMs) to extract contextual information for robots operating in real-world environments. While these methods achieve satisfactory performance, they do not explicitly model human activities, which influence surrounding objects and reshape spatial constraints. This paper presents a novel approach to trajectory planning that integrates human preferences, activities, and spatial context through an enriched 3D scene graph (3DSG) representation. By incorporating activity-based relationships, our method captures the spatial impact of human actions, leading to more context-sensitive trajectory adaptation. Preliminary results indicate that our approach effectively assigns costs to spaces influenced by human activities, so that the robot trajectory remains contextually appropriate and responsive to the current state of the environment. This balance between task efficiency and social appropriateness enhances context-aware human-robot interactions in domestic settings. Future work includes implementing a full planning pipeline and conducting user studies to evaluate trajectory acceptability.

Paper Structure

This paper contains 10 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Overview of the method. Starting from a 3D map and its 3D scene graph representation, our approach computes a preference-based trajectory that is socially aware of human presence in the scene.
  • Figure 2: Our proposed approach constructs an object-centric description of impact factors, considering both objects and spaces, through a human-centered investigation of the scene. Starting from a 3D map, a trajectory, and a set of preferences, we: (a) extract the Partial 3D Semantic Scene Graph, containing the objects that could potentially impact the trajectory; (b) enrich the graph by adding the human and their activities and linking them to the existing nodes of the 3D scene graph; and (c) feed the enriched graph representation, along with the trajectory and preferences, into a Large Language Model (LLM). For each object of interest, the LLM then outputs a cost together with a clearance value that describes how the cost decreases with distance.
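
The cost and clearance pair described in Figure 2 can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not state how the cost decays with distance, so a Gaussian falloff is assumed here, and `ObjectCost`, `point_cost`, `trajectory_cost`, and `partial_graph` are hypothetical names. The numeric values are hand-picked stand-ins for what an LLM might return, and the radius filter is only a rough proxy for step (a).

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, map frame (assumption)


@dataclass
class ObjectCost:
    """Hypothetical container for one object's LLM-assigned values."""
    name: str
    position: Point
    cost: float       # peak cost at the object itself (assumed 0..1 scale)
    clearance: float  # metres over which the cost fades; must be > 0


def point_cost(p: Point, obj: ObjectCost) -> float:
    """Cost one object contributes at point p (assumed Gaussian decay)."""
    d = math.dist(p, obj.position)
    return obj.cost * math.exp(-0.5 * (d / obj.clearance) ** 2)


def trajectory_cost(trajectory: List[Point], objects: List[ObjectCost]) -> float:
    """Sum the contributions of every object at every waypoint."""
    return sum(point_cost(p, o) for p in trajectory for o in objects)


def partial_graph(trajectory: List[Point], objects: List[ObjectCost],
                  radius: float) -> List[ObjectCost]:
    """Keep objects within `radius` of any waypoint (proxy for step (a))."""
    return [o for o in objects
            if any(math.dist(p, o.position) <= radius for p in trajectory)]


if __name__ == "__main__":
    # Made-up scene: an occupied sofa gets a high cost, an empty chair a low one.
    sofa = ObjectCost("sofa (person watching TV)", (2.0, 0.0), 0.9, 1.0)
    chair = ObjectCost("chair (unoccupied)", (2.0, 3.5), 0.1, 0.5)
    direct = [(0.0, 0.0), (2.0, 0.5), (4.0, 0.0)]
    detour = [(0.0, 0.0), (2.0, 3.0), (4.0, 0.0)]
    nearby = partial_graph(direct + detour, [sofa, chair], radius=2.0)
    print(trajectory_cost(direct, nearby), trajectory_cost(detour, nearby))
```

Under these assumptions, a planner could compare candidate trajectories by their summed cost: the detour that passes the unoccupied chair scores lower than the direct path that passes close to the occupied sofa.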