Expected Confidence Dependency: A Novel Rough Set-Based Approach to Feature Selection

Saeed Rasouli, Hamid Karamikabir

TL;DR

This work introduces Expected Confidence Dependency (ECD), a soft-computing, probabilistic generalization of rough-set dependency that weights each conditional equivalence class by its classification confidence. ECD computes an overall dependency Exp(C,D) as the normalized sum of block-wise majority confidences, enabling smooth, partial, and uncertainty-aware feature selection. Theoretical guarantees include normalization, monotonicity, and invariance properties, while experiments on four UCI datasets show that ECD-based forward selection yields more accurate and compact feature subsets than classical, relative, or direct dependency criteria. The approach demonstrates robustness to noise and partial consistency, with practical potential across high-dimensional, heterogeneous data domains. Extensions to incomplete data, scalability improvements, and broader domain applications are identified as promising avenues for future work.

Abstract

This paper proposes Expected Confidence Dependency (ECD), a novel, soft-computing-oriented, accuracy-driven dependency measure for feature selection within the rough set theory framework. Unlike traditional rough set dependency measures, which rely on binary characterizations of conditional blocks, ECD assigns confidence-based contributions to individual equivalence blocks and aggregates them through a normalized expectation operator. We formally establish several desirable properties of ECD, including normalization, compatibility with classical dependency, monotonicity, and invariance under structural and label-preserving transformations.
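
To make the block-wise aggregation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one plausible reading of the description above: each equivalence block of U/C contributes its majority-class confidence, weighted by block size and normalized by |U|. The function names `ecd` and `forward_select`, the stopping tolerance `tol`, and the toy table are all hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict

def ecd(rows, cond_idx, dec_idx):
    """Assumed ECD: size-weighted mean of majority-class confidence over U/C."""
    if not rows:
        raise ValueError("empty decision table")
    # Group objects into equivalence blocks by their condition-attribute values
    # and count the decision labels that occur inside each block.
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in cond_idx)
        blocks[key][row[dec_idx]] += 1
    # |B| * confidence(B) equals the majority-label count in block B,
    # so the weighted sum reduces to a sum of per-block maxima.
    majority_total = sum(max(counts.values()) for counts in blocks.values())
    return majority_total / len(rows)

def forward_select(rows, candidates, dec_idx, tol=1e-12):
    """Greedy forward selection: add the attribute that raises ECD the most."""
    selected = []
    best = ecd(rows, selected, dec_idx)
    remaining = list(candidates)
    while remaining:
        scored = [(ecd(rows, selected + [a], dec_idx), a) for a in remaining]
        score, attr = max(scored, key=lambda pair: pair[0])
        if score <= best + tol:  # no attribute improves the score; stop
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best

# Hypothetical toy decision table: (outlook, temperature, decision).
rows = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("rainy", "hot", "yes"),
    ("rainy", "mild", "yes"),
    ("rainy", "mild", "no"),
    ("cloudy", "hot", "yes"),
]

if __name__ == "__main__":
    print(ecd(rows, [0], 2))                # ECD of {outlook}
    print(forward_select(rows, [0, 1], 2))  # greedy subset and its score
```

On this toy table the "rainy" block is inconsistent (two "yes", one "no"). Classical dependency would drop that block from the positive region entirely, whereas under the assumed formula it still contributes its majority count of 2, which illustrates the partial, confidence-weighted credit the abstract describes.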

Paper Structure

This paper contains 17 sections, 7 theorems, 72 equations, 6 figures, 9 tables, and 3 algorithms.

Key Result

Theorem 4.1

Let $K = (U, C \cup D)$ be a decision system. The following properties hold:

Figures (6)

  • Figure 1: Flowchart for computing the ECD.
  • Figure 2: Heatmap of dependency values for selected attribute subsets.
  • Figure 3: Heatmap of performance metrics of feature selection algorithms on the Breast Cancer dataset.
  • Figure 4: Heatmap of performance metrics of feature selection algorithms on the Credit Approval dataset.
  • Figure 5: Heatmap of performance metrics for feature selection algorithms on the Zoo dataset.
  • ...and 1 more figure

Theorems & Definitions (22)

  • Example 2.1
  • Example 2.2
  • Example 2.3
  • Example 2.4
  • Example 2.5
  • Example 2.6
  • Example 2.7
  • Example 3.1
  • Theorem 4.1
  • proof
  • ...and 12 more