Learning Symbolic Task Representation from a Human-Led Demonstration: A Memory to Store, Retrieve, Consolidate, and Forget Experiences

Luca Buoncompagni, Fulvio Mastrogiovanni

TL;DR

The paper tackles the challenge of deriving intelligible, symbolic task representations from minimal human input. It proposes a memory-inspired framework that stores, retrieves, consolidates, and forgets experiences using the Scene Identification and Tagging (SIT) algorithm, which is built on fuzzy Description Logic and the fuzzyDL reasoner. The framework enables online, one-shot learning from non-annotated demonstrations and produces a hierarchical task representation that can be refined through interaction. The main contributions are the formalisation of a memory-capable SIT framework, the implementation of consolidation and forgetting via score-based heuristics, and a demonstration of online knowledge bootstrapping on a table-assembly scenario, in which persistent branches are consolidated and transient observations are discarded. The work highlights both the feasibility of online, interpretable symbolic learning for human-robot collaboration and the scalability limitations inherent in DL-based reasoning, and it points to future work on heuristic exploration and multi-task demonstrations for broader applicability.

Abstract

We present a symbolic learning framework inspired by cognitive-like memory functionalities (i.e., storing, retrieving, consolidating and forgetting) to generate task representations to support high-level task planning and knowledge bootstrapping. We address a scenario involving a non-expert human, who performs a single task demonstration, and a robot, which online learns structured knowledge to re-execute the task based on experiences, i.e., observations. We consider a one-shot learning process based on non-annotated data to store an intelligible representation of the task, which can be refined through interaction, e.g., via verbal or visual communication. Our general-purpose framework relies on fuzzy Description Logic, which has been used to extend the previously developed Scene Identification and Tagging algorithm. In this paper, we exploit such an algorithm to implement cognitive-like memory functionalities employing scores that rank memorised observations over time based on simple heuristics. Our main contribution is the formalisation of a framework that can be used to systematically investigate different heuristics for bootstrapping hierarchical knowledge representations based on robot observations. Through an illustrative assembly task scenario, the paper presents the performance of our framework to discuss its benefits and limitations.

Paper Structure

This paper contains 21 sections, 7 equations, 4 figures, 2 tables, and 2 algorithms.

Figures (4)

  • Figure 1: An example of input facts (on the right-hand side) required to process the spatial scene $\epsilon_1$ (on the left-hand side). Facts can be encoded by SIT to perform the functionalities shown in Table \ref{tab:SITphases}.
  • Figure 2: The fuzzy cardinality restriction $^{a}\Omega(k)$. If the cardinality ${c_{zh} \geqslant k}$, the restriction is satisfied with fuzzy degree ${p_{zh} = 1}$. If ${c_{zh} \leqslant k^-}$, it is not satisfied. Otherwise, it is fuzzily satisfied with ${p_{zh} \in (0,1)}$. We define ${k^{-} = k (1-a),\,a{\in}[0,1]}$.
  • Figure 3: Examples of scenes that are observed, learned and classified in the memory over time. When the scenes $\epsilon_1$ (Figure \ref{fig:sceneEx}), $\epsilon_2$ (\ref{fig:exScene2}) and $\epsilon_3$ (\ref{fig:exScene3}) have been perceived, SIT learns the categories $\Phi_1$, $\Phi_2$ and $\Phi_3$, respectively, and structures them in the memory $M_t$ (\ref{fig:exGraph}). Each category is computed with $a = 0.5$ and has a score (i.e., $q_1$, $q_2$ and $q_3$). Then, when the scene $\epsilon_4$ (\ref{fig:exScene4}) is perceived, it is classified in a sub-graph of $M_t$, i.e., the classification graph $M^\star$, whose nodes are highlighted in (\ref{fig:exGraph}). Table (\ref{fig:exTable}) shows the classification degree $p^{\epsilon_4{:}\Phi_j}$ and the similarity value $d^{\Phi_j}_{\epsilon_4}$ associated with each $j\text{-th}$ node in $M_t$ except for $\Phi$, which is the root of the graph and categorises an empty scene. Since $p^{\epsilon_4{:}\Phi_2} = 0$, $\Phi_3$ is not a node of $M^\star$ and, therefore, $d^{\Phi_3}_{\epsilon_4}$ is undefined.
  • Figure 4: The demonstration of a table assembly task and the bootstrapped representation. Figures (\ref{fig:memResult1}-\ref{fig:memResult9}) show some of the scenes perceived during the demonstration, and Figure (\ref{fig:memResultGraph}) shows the representation bootstrapped in the robot's memory when the demonstration ended. For clarity, the figures denote scenes $\epsilon$ and categories $\Phi$ with an index related to the time instant of the demonstration. Also, each scene category encodes a symbolic representation of the related scene (i.e., how many legs have been connected to the table), which is not shown in the figures. Note that the grey categories in the memory graph have been forgotten, while the other nodes show the scenes that have been consolidated, i.e., the persistent scenes and sub-scenes that are relevant for bootstrapping a representation of the assembly task.
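
The fuzzy cardinality restriction described in the Figure 2 caption can be illustrated with a minimal sketch. The caption fixes only the two ends of the membership function (degree 1 when $c_{zh} \geqslant k$, degree 0 when $c_{zh} \leqslant k^-$, with $k^- = k(1-a)$); the linear ramp between them is an assumption made here for illustration, and the function name `fuzzy_cardinality_degree` is hypothetical rather than part of SIT or fuzzyDL.

```python
def fuzzy_cardinality_degree(c: float, k: float, a: float) -> float:
    """Degree p to which a cardinality c satisfies the restriction ^a Omega(k).

    Follows the Figure 2 caption: p = 1 if c >= k, p = 0 if c <= k^-,
    where k^- = k * (1 - a) and a is in [0, 1].
    ASSUMPTION: the degree rises linearly between k^- and k; the caption
    only states that p lies in (0, 1) on that interval.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")

    k_minus = k * (1.0 - a)  # lower bound k^- of the fuzzy region

    if c >= k:
        return 1.0  # restriction fully satisfied
    if c <= k_minus:
        return 0.0  # restriction not satisfied
    # Fuzzy region: only reachable when k_minus < k, so no division by zero.
    return (c - k_minus) / (k - k_minus)


if __name__ == "__main__":
    # With a = 0.5 (the value used in the Figure 3 example) and k = 2, k^- = 1.
    for c in (0.0, 1.0, 1.5, 2.0, 3.0):
        print(c, fuzzy_cardinality_degree(c, k=2.0, a=0.5))
```

Under this assumption, setting `a = 0` collapses the ramp into a crisp threshold at `k`, whereas larger values of `a` tolerate scenes with fewer matching elements than the learned category requires.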