Model Selection Through Model Sorting

Mohammad Ali Hajiani; Babak Seyfe

TL;DR

The S-NER method, without any prior information, is shown to be more accurate than feature sorting algorithms such as orthogonal matching pursuit (OMP) that are aided with prior knowledge of the true model order.

Abstract

We propose a novel approach to selecting the best model for the data. Based on the exclusive properties of nested models, we find the most parsimonious model containing the risk minimizer predictor. We prove the existence of probably approximately correct (PAC) bounds on the difference between the minimum empirical risks of two successive nested models, called the successive empirical excess risk (SEER). Based on these bounds, we propose a model order selection method called nested empirical risk (NER). The sorted NER (S-NER) method sorts the models intelligently so that the minimum risk decreases. We construct a test that predicts whether expanding the model decreases the minimum risk. With high probability, NER chooses the true model order and S-NER chooses the most parsimonious model containing the risk minimizer predictor. We apply S-NER model selection to linear regression and show that the S-NER method, without any prior information, can be more accurate than feature sorting algorithms such as orthogonal matching pursuit (OMP) that are aided with prior knowledge of the true model order. In addition, on the UCR datasets, the NER method dramatically reduces the complexity of classification with a negligible loss of accuracy.
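To make the SEER quantity concrete, here is a minimal sketch, not the authors' implementation. It assumes squared loss, linear least-squares models nested by adding one feature at a time in a fixed given order, and the sign convention "risk of the smaller model minus risk of the larger model". The function name `min_empirical_risks`, the synthetic data, and all parameter values are hypothetical. The PAC-bound thresholds that NER uses for selection are not reproduced here, because they are not stated in this summary.

```python
import random


def min_empirical_risks(columns, y):
    """Minimum mean-squared empirical risk of each nested least-squares model.

    Model k uses the first k columns; model 0 always predicts zero.
    Uses modified Gram-Schmidt so each new column only removes the part
    of the residual that earlier columns could not explain.
    """
    n = len(y)
    residual = list(y)
    basis = []  # orthogonalized columns accepted so far
    risks = [sum(r * r for r in residual) / n]
    for col in columns:
        q = list(col)
        for b in basis:
            coef = sum(qi * bi for qi, bi in zip(q, b)) / sum(bi * bi for bi in b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        norm2 = sum(qi * qi for qi in q)
        if norm2 > 1e-12:  # skip columns that are (numerically) redundant
            coef = sum(ri * qi for ri, qi in zip(residual, q)) / norm2
            residual = [ri - coef * qi for ri, qi in zip(residual, q)]
            basis.append(q)
        risks.append(sum(r * r for r in residual) / n)
    return risks


# Hypothetical synthetic data: only the first two features carry signal.
random.seed(0)
n = 40
columns = [[random.gauss(0, 1) for _ in range(n)] for _ in range(6)]
y = [2 * columns[0][i] - columns[1][i] + 0.1 * random.gauss(0, 1) for i in range(n)]

risks = min_empirical_risks(columns, y)
# Successive differences of minimum empirical risks (assumed sign convention).
seer = [risks[k] - risks[k + 1] for k in range(len(risks) - 1)]
for k, s in enumerate(seer, start=1):
    print(f"adding feature {k}: SEER = {s:.4f}")
```

Because the models are nested, each difference is non-negative up to rounding error. In this sketch the differences for the informative features would be expected to be much larger than those for the pure-noise features, which is the kind of gap a test on the SEER can exploit.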

Paper Structure (20 sections, 14 theorems, 96 equations, 4 figures, 1 algorithm)

Introduction
Preliminaries
Concept and Definitions
Model Selection in Nested Model Families
Properties of Minimum Risks and Minimum Empirical Risks in Nested Families
Probably Approximately Correct (PAC) Bounds on the SEER
Nested Empirical Risk (NER) Method
Model Sorting and Selection
Model Selection in Linear Regression
S-NER Model Selection in Linear Regression
SEER PAC Bounds in Linear Regression
Applications
Linear Regression Model Selection Using Synthetic Data
Feature Selection in the Classification of UCR Dataset
Feature Sorting
...and 5 more sections

Key Result

Corollary 1

Let $\{\bar{\mathcal{M}}_k\}_{k=1}^L$ be an arbitrary set of models. Let $\mathcal{M}_1=\bar{\mathcal{M}}_1$ and, for every $k\in\{1,2,\ldots,L-1\}$, let $\mathcal{M}_{k+1}=\mathcal{M}_{k} \cup \bar{\mathcal{M}}_{k+1}$. Then the model family $\{\mathcal{M}_k\}_{k=1}^L$ is sequentially nested.
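The nesting process in Corollary 1 can be illustrated with a small sketch. It assumes, purely for illustration, that each model is a finite set of predictor labels; the paper's models are general, and the names `nest` and `bar_models` are hypothetical.

```python
from itertools import accumulate


def nest(models):
    """Return the cumulative unions M_1, M_2, ..., M_L of the given models."""
    return list(accumulate(models, lambda a, b: a | b))


# Arbitrary, non-nested candidate models (hypothetical predictor labels).
bar_models = [frozenset({1}), frozenset({3, 4}), frozenset({2}), frozenset({1, 5})]
nested = nest(bar_models)

# Each model in the constructed family contains the previous one.
is_nested = all(nested[k] <= nested[k + 1] for k in range(len(nested) - 1))
print(is_nested)
```

The first constructed model equals the first candidate and every later model is a union with its predecessor, so the containment check holds for any input list.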

Figures (4)

Figure 1: Example of parameter spaces in the S-NER model selection procedure. Dashed lines denote the candidate models, and solid lines denote the models with the least minimum empirical risk among the candidate models.
Figure 2: Comparison of the true detection probability of the aided S-NER, the S-NER model selection method, aided OMP, aided LARS, and the EFIC and EBICR methods using OMP and LARS as the feature sorting algorithm, for different SNR values. In this simulation, the number of observations is $n=60$, the number of features is $L=205$, and the order of the true model is $K=5$.
Figure 3: Comparison of the true detection probability of the aided S-NER, the S-NER, aided OMP, aided LARS, and the EFIC and EBICR methods using OMP and LARS as the feature sorting algorithm, for different numbers of observations $n$. In this simulation, the SNR is $6$ dB, the number of features is $L=\lceil n^{1.3}\rceil$, and the order of the true model is $K=5$.
Figure 4: Accuracy and number of kernels (scaled to 9996) of mini-ROCKET versus NER, EFIC, AIC, BIC, and EBICR feature selection on the UCR datasets.

Theorems & Definitions (38)

Definition 1: Nested Models
Definition 2: Partially Nested Models
Definition 3: Non-Nested Models
Definition 4: Sequentially Nested Model Class
Corollary 1: Nesting Process
proof
Corollary 2
proof
Definition 5: Glivenko-Cantelli Function Class [wainwright2019high]
Lemma 1
...and 28 more
