Semantic-Guided RL for Interpretable Feature Engineering

Mohamed Bouadi, Arta Alavi, Salima Benbernou, Mourad Ouziri

TL;DR

SMART is a hybrid approach that uses semantic technologies to guide the generation of interpretable features through a two-step process, Exploitation and Exploration. It significantly improves prediction accuracy while ensuring a high level of interpretability.

Abstract

The quality of Machine Learning (ML) models strongly depends on the input data; as such, generating high-quality features is often required to improve predictive accuracy. This process is referred to as Feature Engineering (FE). However, since manual feature engineering is time-consuming and requires case-by-case domain knowledge, Automated Feature Engineering (AutoFE) is crucial. A major remaining challenge is generating interpretable features. To tackle this problem, we introduce SMART, a hybrid approach that uses semantic technologies to guide the generation of interpretable features through a two-step process: Exploitation and Exploration. The former uses Description Logics (DL) to reason on the semantics embedded in Knowledge Graphs (KG) to infer domain-specific features, while the latter exploits the knowledge graph to conduct a guided exploration of the search space through Deep Reinforcement Learning (DRL). Our experiments on public datasets demonstrate that SMART significantly improves prediction accuracy while ensuring a high level of interpretability.

Paper Structure

This paper contains 17 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: The different workflows of AutoFE and AutoML.
  • Figure 2: An overview of the SMART architecture.
  • Figure 3: An overview of the KGs used by SMART.
  • Figure 4: Feature Importance.
  • Figure 5: Top-10 features of mCAFE (green) and SMART (blue).

Theorems & Definitions (3)

  • Example 1
  • Definition 2.1
  • Definition 2.2