Semantic-Preserving Feature Partitioning for Multi-View Ensemble Learning

Mohammad Sadegh Khorshidi; Navid Yazdanjue; Hassan Gharoun; Danial Yazdani; Mohammad Reza Nikoo; Fang Chen; Amir H. Gandomi

TL;DR

This paper introduces Semantic-Preserving Feature Partitioning (SPFP), an information-theoretic method to construct semantically coherent artificial views for multi-view ensemble learning (MEL) from a single data source. By modifying the conditional likelihood framework (CLF) and defining an SPFP objective with $\nabla$ coefficients and a stopping rule based on entropy and mutual information, SPFP can partition features into multiple views that preserve the information content of the full feature set. The method is validated on eight real-world datasets using XGBoost and Logistic Regression, showing that SPFP-generated views and their ensembles often improve accuracy and reduce predictive uncertainty, while offering computational efficiency through dimensionality reduction. Statistical analyses (Friedman, Conover, and Cliff's delta) reveal significant differences with large effect sizes, supporting the practical value of semantic view construction in MEL, though gains vary by dataset and model complexity. Overall, SPFP provides a rigorous, scalable approach to view construction that balances information preservation and computational cost, with potential extensions to unsupervised settings.

Abstract

In machine learning, the exponential growth of data and the associated "curse of dimensionality" pose significant challenges, particularly with expansive yet sparse datasets. To address these challenges, multi-view ensemble learning (MEL) has emerged as a transformative approach, with feature partitioning (FP) playing a pivotal role in constructing artificial views for MEL. Our study introduces the Semantic-Preserving Feature Partitioning (SPFP) algorithm, a novel method grounded in information theory. The SPFP algorithm partitions datasets into multiple semantically consistent views, enhancing the MEL process. In extensive experiments on eight real-world datasets, ranging from high-dimensional datasets with few instances to low-dimensional datasets with many instances, our method demonstrates notable efficacy. Where high generalization performance is achievable, it maintains model accuracy while significantly improving uncertainty measures. Conversely, where high generalization accuracy is less attainable, it retains uncertainty metrics while enhancing accuracy. An effect size analysis further reveals that the SPFP algorithm outperforms benchmark models with large effect sizes and reduces computational demands through effective dimensionality reduction. The large effect sizes observed in most experiments underscore the algorithm's significant improvements in model performance.

Paper Structure (14 sections, 2 theorems, 22 equations, 41 figures, 20 tables, 1 algorithm)

Introduction
Background
Feature Partitioning Methods
Information-based Feature Selection Methods
Proposed Semantic-Preserving Feature Partitioning Method
Objective Function
Stopping Criteria
Proposed SPFP Algorithm
Conditional Independence Assumption in MVL
Experiments
Data Description
Experimental Setup
Results and Analysis
Conclusion

Key Result

Theorem 3.1

For two sets of features, $F$ and $S$, where $S \subset F$, the entropy of the entire feature set $F$, i.e., $H(F)$, is always greater than or equal to the entropy of the subset $S$, $H(S)$.
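For discrete features under Shannon entropy, one standard way to see this is the chain rule $H(F) = H(S) + H(F \setminus S \mid S)$, where the conditional term is non-negative. The sketch below checks the inequality empirically on a toy dataset. It is an illustration only and not necessarily the argument used in the paper's proof; the helper `joint_entropy` and the rows in `rows` are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log2


def joint_entropy(rows, columns):
    """Empirical Shannon entropy (in bits) of the joint distribution
    of the selected columns, estimated from value-tuple frequencies."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    n = len(rows)
    return -sum((k / n) * log2(k / n) for k in counts.values())


# Hypothetical toy dataset: each row holds three discrete feature values.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
    (0, 0, 1),
    (1, 1, 1),
]

F = (0, 1, 2)  # column indices of the full feature set
S = (0, 1)     # column indices of a subset of F

h_F = joint_entropy(rows, F)
h_S = joint_entropy(rows, S)

# The theorem predicts H(F) >= H(S); allow a tiny float tolerance.
assert h_F >= h_S - 1e-12
print(f"H(F) = {h_F:.4f} bits, H(S) = {h_S:.4f} bits")
```

The check relies on entropies estimated from discrete counts; continuous features would need to be discretized first, which is an assumption of this sketch rather than a statement about the paper's preprocessing.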

Figures (41)

Figure 1: The number of common features among the artificial views generated by the SPFP algorithm, with parameters $N_\theta = 5$, $N_F = 0.1 \times | F |$ and $r = 0.6$.
Figure 2: Overview of feature diversity results using the SPFP algorithm over 30 runs. The figure displays, for each view, the mean number of features selected per run (orange bars), the total count of unique features selected across all runs (green bars), and the number of features common to every run (blue bars).
Figure 3: The figure illustrates the best performing XGBoost models in comparison to the benchmark model, with the segments representing the 95% confidence interval of Cliff's $\delta$ (center point). Grey segments indicate cases where either $P_{fr}>0.05$ or $P_{cn}>0.05$, suggesting no significant difference from the benchmark. Blue segments denote instances where the model outperforms the benchmark ($P_{fr}<0.05$, $P_{cn}<0.05$, and $\delta>0$), while red segments indicate that the benchmark model outperforms the corresponding XGBoost model ($P_{fr}<0.05$, $P_{cn}<0.05$, and $\delta<0$).
Figure 4: The figure illustrates the best performing LR models in comparison to the benchmark model, with the segments representing the 95% confidence interval of Cliff's $\delta$ (center point). Grey segments indicate cases where either $P_{fr}>0.05$ or $P_{cn}>0.05$, suggesting no significant difference from the benchmark. Blue segments denote instances where the model outperforms the benchmark ($P_{fr}<0.05$, $P_{cn}<0.05$, and $\delta>0$), while red segments indicate that the benchmark model outperforms the corresponding LR model ($P_{fr}<0.05$, $P_{cn}<0.05$, and $\delta<0$).
Figure S.6: The raincloud plot of $F_1$ score results obtained from 30 XGBoost runs.
...and 36 more figures
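The captions for Figures 3 and 4 classify models by the sign of Cliff's $\delta$. As a reminder of what that statistic measures, here is a minimal sketch of its standard definition: the proportion of cross-sample pairs where the first sample is larger, minus the proportion where it is smaller. The function name `cliffs_delta` and the score lists are hypothetical and are not taken from the paper, and the sketch does not reproduce the confidence intervals or the Friedman and Conover tests mentioned in the captions.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)
    with x drawn from xs and y drawn from ys. Ranges from -1 to 1."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical scores for a candidate model and a benchmark model.
model_scores = [0.91, 0.93, 0.92, 0.95]
benchmark_scores = [0.90, 0.89, 0.92, 0.88]

delta = cliffs_delta(model_scores, benchmark_scores)

# A positive delta means the candidate's scores tend to exceed the benchmark's.
print(f"Cliff's delta = {delta:.4f}")
```

A value near 1 means almost every candidate score exceeds every benchmark score, a value near -1 means the reverse, and a value near 0 means the two samples overlap heavily.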

Theorems & Definitions (4)

Theorem 3.1
proof
Theorem 3.2
proof
