LCEN: A Nonlinear, Interpretable Feature Selection and Machine Learning Algorithm

Pedro Seber; Richard D. Braatz

TL;DR

LCEN addresses the need for nonlinear, interpretable, and sparse feature selection. It integrates a LASSO-based expansion, two clip steps, and elastic-net (EN) fitting to produce sparse, accurate models capable of rediscovering physical laws from data. Across artificial and real datasets, LCEN demonstrates robustness to noise, multicollinearity, and data scarcity, often matching or surpassing dense nonlinear methods while remaining interpretable and running faster than comparable thresholded EN approaches. The approach shows practical value for critical domains and offers clear avenues for extension to classification and physics-guided modeling.

Abstract

Interpretable models can have advantages over black-box models, and interpretability is essential for the application of machine learning in critical settings, such as aviation or medicine. This article introduces the LASSO-Clip-EN (LCEN) algorithm for nonlinear, interpretable feature selection and machine learning modeling. Across a wide variety of artificial and empirical datasets, LCEN constructed sparse models that were frequently more accurate than those of other methods, including sparse, nonlinear methods. LCEN was empirically observed to be robust against many issues typically present in datasets and modeling, including noise, multicollinearity, and data scarcity. As a feature selection algorithm, LCEN matched or surpassed the thresholded elastic net but was, on average, 10.3-fold faster in our experiments. For feature selection, LCEN can also rediscover multiple physical laws from empirical data. As a machine learning algorithm, when tested on processes with no known physical laws, LCEN achieved better results than many other dense and sparse methods, including results comparable to or better than those of ANNs on multiple datasets.
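
The LASSO, clip, and elastic-net steps named above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the degree-2 feature expansion, the coordinate-descent solver, the function names, the synthetic data, and the values of `alpha`, `l1_ratio`, and `cutoff` are all assumptions chosen for illustration, and the hyperparameter search that a real pipeline would need is omitted.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net_cd(X, y, alpha, l1_ratio, n_iter=100):
    """Coordinate descent (no intercept) for the objective
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    Setting l1_ratio=1.0 gives the LASSO."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw; equals y because w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

def expand_degree2(rows):
    """Assumed expansion: original features plus all degree-2 products."""
    d = len(rows[0])
    pairs = [(j, k) for j in range(d) for k in range(j, d)]
    names = [f"x{j}" for j in range(d)]
    names += [f"x{j}^2" if j == k else f"x{j}*x{k}" for j, k in pairs]
    expanded = [list(r) + [r[j] * r[k] for j, k in pairs] for r in rows]
    return names, expanded

def lasso_clip_en(X, y, names, alpha, l1_ratio, cutoff):
    """LASSO fit, clip small coefficients, EN refit on survivors, clip again."""
    w_lasso = elastic_net_cd(X, y, alpha, 1.0)
    keep = [j for j, wj in enumerate(w_lasso) if abs(wj) >= cutoff]
    X_kept = [[row[j] for j in keep] for row in X]
    w_en = elastic_net_cd(X_kept, y, alpha, l1_ratio)
    return {names[j]: wj for j, wj in zip(keep, w_en) if abs(wj) >= cutoff}

# Hypothetical data: y depends on x0 and x1^2 only; x2 is irrelevant.
rng = random.Random(0)
raw = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * r[0] + 1.5 * r[1] ** 2 + rng.gauss(0.0, 0.05) for r in raw]
names, X = expand_degree2(raw)
model = lasso_clip_en(X, y, names, alpha=0.01, l1_ratio=0.5, cutoff=0.1)
print(model)  # surviving features and their coefficients
```

The returned dictionary maps each surviving feature name to its coefficient, which is what makes the fitted model readable as an explicit equation. On this toy problem the coefficients are slightly shrunk relative to the generating values because both penalties remain active in the final fit.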

Paper Structure (20 sections, 10 figures, 17 tables, 1 algorithm)

Introduction
Methods
The LCEN algorithm
LCEN is optimal when compared to ablated and variant algorithms
Experimental setup
Results
In artificial datasets, LCEN provides high feature selection performance and robustness to noise and multicollinearity
In empirical datasets, LCEN provides higher predictive performance than many other methods
Discussion
Appendix --- Feature Expansion Algorithm
Appendix --- Description of datasets used in this work
Appendix --- List of hyperparameters used in this work
Appendix --- Ablation tests
Appendix --- Additional results with artificial data
"Artificial Linear" datasets
...and 5 more sections

Figures (10)

Figure 1: Test set median MSE for the "4th-degree, univariate polynomial" dataset. ALVEN results (left, reproduced from Sun and Braatz (2021) with permission) show that the error increases monotonically with noise and that the degree 4 "unbiased model" is the best at low noise levels but is displaced by the degree 2 "biased model" at higher noise levels. In contrast, LCEN results (right) show that the median errors converge at higher noise levels. Furthermore, the LCEN median errors are typically over 60% smaller than the ALVEN median errors, and the degree 4 "unbiased model" is always the best model regardless of the noise level. The "noise level" and "Noise variance $\sigma^2$" terms are equivalent in this figure. Fig. \ref{SPA_comparison_interquartile} contains interquartile ranges for the LCEN model's test MSEs.
Figure A1: Plots of the Matthews Correlation Coefficients (MCCs) for models tested on the "Artificial Linear" dataset with 0% noise and 25% additional false features, as written in each subfigure's title.
Figure A2: Plots of the Matthews Correlation Coefficients (MCCs) for models tested on the "Artificial Linear" dataset with 0% noise and 50% additional false features, as written in each subfigure's title.
Figure A3: Plots of the Matthews Correlation Coefficients (MCCs) for models tested on the "Artificial Linear" dataset with 0% noise and 75% additional false features, as written in each subfigure's title.
Figure A4: Plots of the Matthews Correlation Coefficients (MCCs) for models tested on the "Artificial Linear" dataset with 0% noise and 100% additional false features, as written in each subfigure's title.
...and 5 more figures
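
The Matthews Correlation Coefficient used in Figures A1 to A4 can be computed for a feature selection result by treating each candidate feature as a binary decision (selected or not) and comparing it against the known relevant set. The sketch below assumes that framing; the function name and the toy feature sets are hypothetical and are not taken from the paper.

```python
import math

def feature_selection_mcc(true_features, selected_features, all_features):
    """MCC of a selection, scoring each candidate feature as one binary decision."""
    true_set, selected = set(true_features), set(selected_features)
    tp = fp = tn = fn = 0
    for f in all_features:
        if f in selected and f in true_set:
            tp += 1  # relevant feature kept
        elif f in selected:
            fp += 1  # false feature kept
        elif f in true_set:
            fn += 1  # relevant feature dropped
        else:
            tn += 1  # false feature dropped
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy setup: 4 relevant features plus 4 additional false features.
candidates = [f"x{i}" for i in range(8)]
relevant = ["x0", "x1", "x2", "x3"]

perfect = feature_selection_mcc(relevant, ["x0", "x1", "x2", "x3"], candidates)
one_swap = feature_selection_mcc(relevant, ["x0", "x1", "x2", "x4"], candidates)
print(perfect, one_swap)  # 1.0 0.5
```

An MCC of 1 means the selection matches the relevant set exactly, while swapping one relevant feature for a false one in this toy setup halves the score.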
