Inference of Human-derived Specifications of Object Placement via Demonstration

Alex Cuellar, Ho Chit Siu, Julie A Shah

TL;DR

This work addresses the challenge of capturing human preferences for spatial object arrangement by introducing Positionally-Augmented RCC (PARCC), a logic that extends RCC with directional and class-relational capabilities. It provides a CNF-based PARCC formula framework and an inference pipeline that learns specifications from demonstrations, using a two-stage process: discovering disjunctive clauses that all demonstrations satisfy, then assessing which clauses reflect the demonstrator's intent via non-specification demonstrations. A human study shows that specifications inferred from demonstrations ($\Phi_D$) align better with participants' demonstrated patterns than human-provided specifications ($\Phi_S$), and that non-specification demonstrations offer a reasonable surrogate for unlabeled behavior. The approach enables robust, human-aligned planning for object placement in tasks like packing and sorting, with potential integration into automated robotic pipelines where spatial relationships matter for safety and efficiency.

Abstract

As robots' manipulation capabilities improve for pick-and-place tasks (e.g., object packing, sorting, and kitting), methods for understanding human-acceptable object configurations remain limited in their ability to express the spatial relationships important to humans. To advance robotic understanding of human rules for object arrangement, we introduce positionally-augmented RCC (PARCC), a formal logic framework based on region connection calculus (RCC) for describing the relative position of objects in space. Additionally, we introduce an inference algorithm for learning PARCC specifications via demonstrations. Finally, we present the results from a human study, which demonstrate our framework's ability to capture a human's intended specification and the benefits of learning-from-demonstration approaches over human-provided specifications.
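
The two-stage inference summarized in the TL;DR can be illustrated with a minimal Python sketch. It is not the paper's algorithm: PARCC relations are replaced by plain string labels, every name below (`candidate_clauses`, `filter_by_non_spec`, the literal strings) is hypothetical, and the second stage is simplified to an assumed rule that keeps a clause only if at least one non-specification demonstration violates it.

```python
from itertools import combinations

def satisfies(demo, clause):
    """A disjunctive clause holds if at least one of its literals holds in the demo."""
    return any(literal in demo for literal in clause)

def candidate_clauses(literals, demos, max_size=2):
    """Stage 1: find disjunctions (up to max_size literals) that every demo satisfies."""
    found = []
    for size in range(1, max_size + 1):
        for clause in combinations(sorted(literals), size):
            # Skip clauses that merely extend a smaller clause already found.
            if any(set(prev) <= set(clause) for prev in found):
                continue
            if all(satisfies(demo, clause) for demo in demos):
                found.append(clause)
    return found

def filter_by_non_spec(clauses, non_spec_demos):
    """Stage 2 (assumed rule): keep clauses that some non-specification demo violates.

    A clause that also holds in arrangements made without the specification in
    mind is treated as incidental rather than intended.
    """
    return [
        clause for clause in clauses
        if any(not satisfies(demo, clause) for demo in non_spec_demos)
    ]

# Hypothetical literals standing in for PARCC relations.
literals = {"apples_left_of_oranges", "apples_right_of_oranges", "can_in_box"}

# Each demonstration is the set of literals true in its final arrangement.
demos = [
    frozenset({"can_in_box", "apples_left_of_oranges"}),
    frozenset({"can_in_box", "apples_right_of_oranges"}),
]
# Arrangement produced without following the specification (fruit mixed together).
non_spec_demos = [frozenset({"can_in_box"})]

stage1 = candidate_clauses(literals, demos)
spec = filter_by_non_spec(stage1, non_spec_demos)
print(stage1)
print(spec)
```

In this toy example, stage 1 finds both "the can is in the box" and "apples are left or right of the oranges", and stage 2 drops the first because it also holds in the non-specification demonstration. The remaining clauses, read as a conjunction, play the role of a CNF specification.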

Paper Structure

This paper contains 17 sections, 9 equations, 5 figures, 1 table, 2 algorithms.

Figures (5)

  • Figure 1: Two example configurations of apples, oranges, and cans in a box.
  • Figure 2: The demonstration interface we used in our experiment. The initial state (left) and completed state (right) are shown.
  • Figure 3: A pipeline showing our human study procedure. Steps directly involving human participation are numbered 1-5. Section \ref{subsec:experimental_setup} depicts the final questions given to the human.
  • Figure 4: Pre-generated demonstrations of the box packing environment initially shown to study subjects ($\mathcal{D}_I$).
  • Figure 5: Box-and-whisker plots of Likert responses. (left) Likert responses to how well $C_D$ matched patterns in subjects' demonstrations (Q1), and how well $C_S$ matched patterns in subjects' demonstrations (Q2). Responses to Q1 were significantly greater than 3 ($p = 5.1 \times 10^{-8}$), and responses to Q2 were significantly less than 3 ($p = 3.8 \times 10^{-7}$). (right) Likert responses indicating how well $C_D$ matched patterns in subjects' demonstrations between Groups A and B. Responses did not differ significantly between the two groups.

Theorems & Definitions (4)

  • Definition 3.1: PARCC object relation
  • Definition 3.2: PARCC class relation
  • Definition 3.3: PARCC Formula
  • Definition 3.4: Demonstration