Incremental Bootstrapping and Classification of Structured Scenes in a Fuzzy Ontology

Luca Buoncompagni, Fulvio Mastrogiovanni

TL;DR

This work extends the Scene Identification and Tagging (SIT) framework from crisp OWL-DL ontologies to a fuzzy Description Logic (FuzzyDL) setting to address perception noise in incremental scene bootstrapping. By encoding scenes with fuzzy cardinalities using a sigma-count approach and left-shoulder membership functions, the fuzzy SIT algorithm learns, structures, and classifies scenes while maintaining the core phases of the original approach. The results show that fuzzy SIT improves robustness to noisy observations and preserves the spirit of incremental learning, though it trades off some intelligibility of the bootstrapped knowledge and incurs higher reasoning complexity. The work demonstrates a viable path toward robust, human-interpretable, ontology-grounded scene representations suitable for assistive robotics, while highlighting ongoing challenges in explanation and computational scalability.
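As a rough illustration of the sigma-count idea mentioned above, the cardinality of a fuzzy set can be taken as the sum of its membership degrees. The sketch below is a minimal example under that assumption; the fact layout `(relation, degree)` and the function names are hypothetical and are not taken from the SIT implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sigma_count(degrees: Iterable[float]) -> float:
    """Sigma-count of a fuzzy set: the sum of its membership degrees."""
    total = 0.0
    for p in degrees:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"membership degree out of [0, 1]: {p}")
        total += p
    return total


def fuzzy_cardinalities(facts: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group fuzzy facts by relation name and sigma-count each group.

    The (relation, degree) tuple layout is an assumption made for this sketch.
    """
    grouped = defaultdict(list)
    for relation, degree in facts:
        grouped[relation].append(degree)
    return {relation: sigma_count(ds) for relation, ds in grouped.items()}


# Hypothetical scene: two noisy "front" relations and one weak "right" relation.
facts = [("front", 0.9), ("front", 0.6), ("right", 0.2)]
cards = fuzzy_cardinalities(facts)
```

Because the cardinalities are real-valued, a small change in a perceived degree shifts the count slightly instead of flipping an integer count, which is the kind of behaviour a fuzzy cardinality restriction can then tolerate.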

Abstract

We foresee robots that bootstrap knowledge representations and use them for classifying relevant situations and making decisions based on future observations. Particularly for assistive robots, the bootstrapping mechanism might be supervised by humans who should not repeat a training phase several times and should be able to refine the taught representation. We consider robots that bootstrap structured representations to classify some intelligible categories. Such a structure should be incrementally bootstrapped, i.e., without invalidating the identified category models when a new additional category is considered. To tackle this scenario, we presented the Scene Identification and Tagging (SIT) algorithm, which bootstraps structured knowledge representation in a crisp OWL-DL ontology. Over time, SIT bootstraps a graph representing scenes, sub-scenes and similar scenes. Then, SIT can classify new scenes within the bootstrapped graph through logic-based reasoning. However, SIT has issues with sensory data because its crisp implementation is not robust to perception noise. This paper presents a reformulation of SIT within the fuzzy domain, which exploits a fuzzy DL ontology to overcome the robustness issues. By comparing the performance of the fuzzy and crisp implementations of SIT, we show that fuzzy SIT is robust, preserves the properties of its crisp formulation, and enhances the bootstrapped representations. On the other hand, the fuzzy implementation of SIT leads to less intelligible knowledge representations than those bootstrapped in the crisp domain.

Paper Structure

This paper contains 22 sections, 16 equations, 9 figures, 1 table, 1 algorithm.

Figures (9)

  • Figure 1: Examples of input facts that represent two scenes. The description of the element types (on top) is common to both scenes.
  • Figure 2: Figure (a) shows the membership function $\Omega$, which defines a minimal cardinality restriction to $k$. Given a fuzziness value $a\in[0,1]$ and a cardinality $c_{zsh}\in\mathbb{R}^+$, ${^{a}\Omega}(k)$ specifies the fuzzy degree $p^{\epsilon_t{:}\Phi_j}_{zsh}$ that represents whether $c_{zsh}$ satisfies the restriction. The other figures show two cardinality restrictions, $\Omega_1$ (in red) and $\Omega_2$ (in green), which are associated with the $zsh$-th beliefs combination of the learned categories $\Phi_1$ and $\Phi_2$, respectively. Figures (b), (c) and (d) show the implication between $\Omega_1$ and $\Omega_2$ (in blue), whose minimum value is the subsumption degree $p^{\Phi_1\sqsubseteq\Phi_2}_{zsh}$; this degree is used to identify implications between scene categories, i.e., to structure the bootstrapped knowledge into a graph. In particular, (b) shows the case where the minimal cardinality restriction $\Omega_2$ always implies $\Omega_1$ (i.e., $\Omega_2$ respects the restriction $\Omega_1$), (d) shows the case where $\Omega_2$ never implies $\Omega_1$, and (c) shows the case where the implication has a fuzzy degree in $(0,1)$.
  • Figure 3: The fuzzy kernel defining a $\langle(\gamma_x,\gamma_y){:}\text{front},\,p_{iz}\rangle$ axiom (\ref{eq:in}). Given an element of interest $\gamma_x$ located in $(0,0)$, the kernel provides the degree $p_{iz}$ based on the relative position of the $\gamma_y$ element. The kernel is oriented based on a global reference frame to discriminate front and right relations.
  • Figure 4: Scenes perceived as noisy symbolic facts over time, the representation that fuzzy SIT bootstrapped, and the classification of new scenes. Figures (a)--(c) show a sequence of scenes $\epsilon_t$ that were subsequently learned as categories $\Phi_t$. Scene categories were structured in the memory graph $M_t$ for different fuzziness values, as shown in (f) for $a{=}0.3$ and (g) for $a{=}0.7$. The graph (f) also highlights the classification graph $M^\star_6$ obtained when $\epsilon_1$ (a) was arranged again but with small differences (i.e., $\epsilon_6{\thickapprox}\epsilon_1$) at $t{=}6$, while (g) highlights $M^\star_6$ when a new scene $\epsilon_6{\thickapprox}\epsilon_4$ was rearranged. Table (h) details the fuzzy classification degrees $p^{\epsilon_6{:}\Phi_j}$ and similarity values $d^{\Phi_j}_{\epsilon_6}$ that were encoded in the $j$-th nodes of $M^\star_6$ for the two possible new scenes $\epsilon_6$. Since $\epsilon_6$ was always classified with high values in some $\Phi_j$, no category $\Phi_6$ was learned at $t{=}6$, i.e., $M_6$ had the same structure as $M_5$.
  • Figure 5: The fuzzy classification degree distribution within the scenario introduced in Figure \ref{fig:tennisGlass}. Each plot shows the classification degree $p^{\epsilon_t{:}\Phi_b}$ of 2500 scenes $\epsilon_t$ classified in the category $\Phi_b$, which was learned from $\epsilon_b$ (Figure \ref{fig:tennisGlassBalance}). In the plots, the two spheres refer to the balls shown in Figure \ref{fig:tennisGlass}, and the degree $p^{\epsilon_t{:}\Phi_b}$ is drawn relative to the position that the glass had in each $\epsilon_t$. The plots use the colormap in Figure \ref{fig:kernel} and cover different fuzziness values. For small fuzziness values, only very similar scenes are classified (i.e., $a{=}0$ reduces the classification to the crisp domain), while for high fuzziness values this condition is more relaxed. Note that the data shown in these plots are affected by perception noise in terms of object recognition and position.
  • ...and 4 more figures
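
The minimal cardinality restriction and the subsumption degree described in the Figure 2 caption can be sketched roughly as follows. This is only an illustration under stated assumptions: the linear ramp used for $\Omega$, the way the fuzziness $a$ widens it, and the choice of the Łukasiewicz implication are all assumptions of this sketch and are not specified in the text above. The function names are hypothetical.

```python
from typing import Callable, Iterable


def min_cardinality_degree(c: float, k: float, a: float) -> float:
    """Degree to which cardinality c satisfies "at least k" with fuzziness a.

    Assumed shape (not from the paper): 0 up to k * (1 - a), 1 from k onwards,
    linear in between. With a = 0 this is the crisp test c >= k.
    """
    if c >= k:
        return 1.0
    lower = k * (1.0 - a)
    if c <= lower:
        return 0.0
    return (c - lower) / (k - lower)


def lukasiewicz_implication(x: float, y: float) -> float:
    """Lukasiewicz fuzzy implication; the operator choice is an assumption."""
    return min(1.0, 1.0 - x + y)


def subsumption_degree(
    antecedent: Callable[[float], float],
    consequent: Callable[[float], float],
    cardinalities: Iterable[float],
) -> float:
    """Minimum of the implication antecedent(c) -> consequent(c) over a grid."""
    return min(
        lukasiewicz_implication(antecedent(c), consequent(c))
        for c in cardinalities
    )


# Two hypothetical restrictions: "at least 1" and "at least 3", both with a = 0.5.
omega_1 = lambda c: min_cardinality_degree(c, k=1.0, a=0.5)
omega_2 = lambda c: min_cardinality_degree(c, k=3.0, a=0.5)
grid = [i / 10 for i in range(0, 51)]  # cardinalities 0.0, 0.1, ..., 5.0

always = subsumption_degree(omega_2, omega_1, grid)  # omega_2 implies omega_1 everywhere
never = subsumption_degree(omega_1, omega_2, grid)   # fails where omega_1 holds fully but omega_2 does not
```

In this toy setting the stricter restriction implies the looser one with degree 1, while the converse drops to 0; restrictions with overlapping ramps would give an intermediate degree, which corresponds to the fuzzy case in panel (c).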