View selection in multi-view stacking: Choosing the meta-learner
Wouter van Loon, Marjolein Fokkema, Botond Szabo, Mark de Rooij
TL;DR
This study evaluates how the choice of meta-learner in multi-view stacking (MVS) affects both view selection and predictive accuracy, using simulations and two gene-expression datasets. Seven nonnegative meta-learners are compared: the interpolating predictor, ridge, elastic net, lasso, adaptive lasso, stability selection, and nonnegative forward selection (NNFS). In each case a base-learner is trained on every view, and its cross-validated predictions, collected in a matrix Z, are passed to the meta-learner. Results show that the nonnegative lasso, nonnegative adaptive lasso, and nonnegative elastic net consistently balance sparsity and accuracy. NNFS offers little advantage over these three, and ridge, stability selection, and the interpolating predictor can underperform, especially in high-dimensional or highly correlated settings. Elastic net is preferable when correlated views are present, and lasso yields the sparsest solutions. In real data, the lasso often achieves the best accuracy with minimal views, whereas elastic net and ridge may improve AUC or H-measure at the cost of selecting more views, which highlights a practical trade-off between predictive performance and interpretability. Overall, the findings offer guidance for choosing a meta-learner in MVS when accurate, sparse, and interpretable view selection is needed in high-dimensional multi-view problems such as genomics and similar biomedical applications.
Abstract
Multi-view stacking is a framework for combining information from different views (i.e., different feature sets) describing the same set of objects. In this framework, a base-learner algorithm is trained on each view separately, and the resulting predictions are then combined by a meta-learner algorithm. In a previous study, stacked penalized logistic regression, a special case of multi-view stacking, was shown to be useful for identifying which views are most important for prediction. In this article, we extend this research by considering seven different algorithms as the meta-learner and evaluating their view selection and classification performance in simulations and in two applications on real gene-expression data sets. Our results suggest that if both view selection and classification accuracy are important to the research at hand, then the nonnegative lasso, nonnegative adaptive lasso and nonnegative elastic net are suitable meta-learners. Exactly which of these three is to be preferred depends on the research context. The remaining four meta-learners, namely nonnegative ridge regression, nonnegative forward selection, stability selection and the interpolating predictor, show few advantages that would make them preferable to these three.
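To make the two-stage procedure concrete, the following is a minimal sketch of multi-view stacking with a nonnegative lasso meta-learner, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`fit_logistic`, `multi_view_stack`), the unpenalized logistic base-learner, the gradient-descent fitting routine, the fixed penalty value, and the toy data are all hypothetical choices made for brevity.

```python
import math
import random

def sigmoid(t):
    # Clamp the argument so math.exp cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(X, y, nonneg_l1=None, lr=0.1, n_iter=300):
    """Logistic regression fitted by gradient descent (illustrative only).

    If nonneg_l1 is a number, the coefficients are constrained to be >= 0 and
    L1-penalized with that weight (a nonnegative lasso); the intercept is free.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        if nonneg_l1 is None:
            w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        else:
            # Proximal step for the penalty nonneg_l1 * sum(w) subject to w >= 0.
            w = [max(0.0, wj - lr * (gj + nonneg_l1)) for wj, gj in zip(w, grad_w)]
    return w, b

def predict(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def multi_view_stack(views, y, n_folds=5, penalty=0.02):
    """Return the cross-validated prediction matrix Z and the meta-learner fit."""
    n = len(y)
    # Z[i][v] holds the out-of-fold prediction of view v's base-learner for object i.
    Z = [[0.0] * len(views) for _ in range(n)]
    for v, X in enumerate(views):
        for k in range(n_folds):
            test = [i for i in range(n) if i % n_folds == k]
            train = [i for i in range(n) if i % n_folds != k]
            model = fit_logistic([X[i] for i in train], [y[i] for i in train])
            for i, p_hat in zip(test, predict(model, [X[i] for i in test])):
                Z[i][v] = p_hat
    # Meta-learner: nonnegative lasso on Z; a zero coefficient drops that view.
    coef, intercept = fit_logistic(Z, y, nonneg_l1=penalty, lr=0.5, n_iter=500)
    return Z, coef, intercept

# Toy data (assumed): view 0 carries signal, views 1 and 2 are pure noise.
rng = random.Random(0)
n = 100
y = [i % 2 for i in range(n)]
signal = [[yi + rng.gauss(0, 1) for _ in range(4)] for yi in y]
noise_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in y]
noise_b = [[rng.gauss(0, 1) for _ in range(4)] for _ in y]

Z, coef, intercept = multi_view_stack([signal, noise_a, noise_b], y)
selected = [v for v, c in enumerate(coef) if c > 0]
print("meta-learner coefficients:", [round(c, 3) for c in coef])
print("selected views:", selected)
```

In this sketch a view counts as selected when its meta-learner coefficient is strictly positive. In practice the penalty would be tuned (for example by cross-validation) rather than fixed, and swapping the proximal step for a different penalty gives a different meta-learner from the comparison.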
