Saliency Map-Guided Knowledge Discovery for Subclass Identification with LLM-Based Symbolic Approximations

Tim Bohne, Anne-Kathrin Patricia Windler, Martin Atzmueller

TL;DR

This paper tackles the discovery of latent subclasses in multiclass time series, using gradient-based saliency maps from trained networks to guide knowledge discovery. It introduces a seven-step neuro-symbolic pipeline that clusters time series using both the signals and their saliency maps, then uses an LLM to translate the cluster centroids into symbolic descriptions and to perform fuzzy knowledge graph (KG) matching against a sensor-fault ontology. Across three UCR datasets, multivariate (signal + saliency) clustering substantially improves clustering metrics and subclass discovery compared to clustering the signals alone, indicating a strong link between subsymbolic patterns and symbolic knowledge representations. The work advances bidirectional neuro-symbolic reasoning by enabling knowledge graph-based interpretation of learned patterns, with practical implications for sensor fault analysis and domain knowledge integration.
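
To make the idea of multivariate (signal + saliency) clustering concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each series and its saliency map are simply concatenated into one feature vector and clustered with plain Euclidean k-means; the helper names `stack_channels` and `kmeans`, the toy data, and the choice of distance and initialization are all hypothetical.

```python
from typing import List, Sequence, Tuple


def stack_channels(signal: Sequence[float], saliency: Sequence[float]) -> List[float]:
    """Concatenate a signal and its saliency map into one feature vector."""
    if len(signal) != len(saliency):
        raise ValueError("signal and saliency map must have the same length")
    return list(signal) + list(saliency)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd's k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Toy data (assumed): the signals are nearly identical, but the saliency
# maps highlight either the start or the end of the series.
signals = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0],
           [0.1, 1.0, 0.0, 1.0], [0.0, 0.9, 0.0, 1.0]]
saliency_maps = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0],
                 [0.9, 0.0, 0.0, 0.1], [0.1, 0.0, 0.0, 0.9]]

points = [stack_channels(s, m) for s, m in zip(signals, saliency_maps)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1]
```

In this toy case the signals alone carry almost no information that separates the two groups, while the saliency channel does, which is the intuition behind adding it to the clustering input.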

Abstract

This paper proposes a novel neuro-symbolic approach for sensor signal-based knowledge discovery, focusing on identifying latent subclasses in time series classification tasks. The approach leverages gradient-based saliency maps derived from trained neural networks to guide the discovery process. Multiclass time series classification problems are transformed into binary classification problems through label subsumption, and classifiers are trained for each of these to yield saliency maps. The input signals, grouped by predicted class, are clustered under three distinct configurations. The centroids of the final set of clusters are provided as input to an LLM for symbolic approximation and fuzzy knowledge graph matching to discover the underlying subclasses of the original multiclass problem. Experimental results on well-established time series classification datasets demonstrate the effectiveness of our saliency map-driven method for knowledge discovery, outperforming signal-only baselines in both clustering and subclass identification.
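
As an illustration of label subsumption, the sketch below merges several original classes under one binary label. The abstract does not specify how classes are grouped, so the grouping used here and the function name `subsume_labels` are assumptions made purely for the example.

```python
from typing import Dict, Iterable, List


def subsume_labels(labels: Iterable[int], groups: Dict[int, List[int]]) -> List[int]:
    """Map each original class label to the binary label of its group."""
    lookup: Dict[int, int] = {}
    for binary_label, originals in groups.items():
        for original in originals:
            if original in lookup:
                raise ValueError(f"class {original} appears in more than one group")
            lookup[original] = binary_label
    # A label missing from every group raises KeyError, which is intended here.
    return [lookup[y] for y in labels]


# Hypothetical four-class problem: classes 0 and 1 are subsumed under
# binary label 0, classes 2 and 3 under binary label 1.
original = [0, 1, 2, 3, 2, 0]
groups = {0: [0, 1], 1: [2, 3]}
binary = subsume_labels(original, groups)
print(binary)  # [0, 0, 1, 1, 1, 0]
```

A classifier trained on `binary` never sees the original four labels, so recovering them afterwards from clusters within each predicted class is the subclass discovery task the abstract describes.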

Paper Structure

This paper contains 12 sections, 6 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Minimalistic Sensor Fault Ontology
  • Figure 2: Multivariate Clusters InsectWingbeatSound (class $0$)
  • Figure 3: Multivariate Clusters Mallat (class $0$)
  • Figure 4: Centroids for UWaveGestureLibraryAll (Class $1$)
  • Figure 5: Centroids for Mallat (Class $1$)