
Focused Relative Risk Information Criterion for Variable Selection in Linear Regression

Nils Lid Hjort

TL;DR

This paper develops a focused framework for variable selection in linear regression by introducing FRIC, a focused information criterion that compares submodels against a wide model for the specific parameter $\mu={\rm E}\,(Y\,|\,x_0)$. It provides exact finite-sample FRIC theory using confidence distributions for the relative risks ${\rm rr}_S$ and extends the approach to multiple focus parameters through AFRIC, enabling aggregated, weights-based model evaluation. The work connects FRIC/AFRIC to classical criteria like AIC and Mallows' $C_p$ while offering exact, data-driven confidence assessments and a simple, interpretable decision rule in common diagonal-variance settings. The methods are illustrated on a birthweight dataset, highlighting personalized model ranking, potential for model averaging, and applicability to broader regression settings, including extensions to generalized linear models and non-normal residuals under appropriate conditions.

Abstract

This paper motivates and develops a novel and focused approach to variable selection in linear regression models. For estimating the regression mean $\mu={\rm E}\,(Y\,|\,x_0)$, for the covariate vector of a given individual, there is a list of competing estimators, say $\widehat{\mu}_S$ for each submodel $S$. Exact expressions are found for the relative mean squared error risks, when compared to the widest model available, say ${\rm mse}_S/{\rm mse}_{\rm wide}$. The theory of confidence distributions is used for accurate assessments of these relative risks. This leads to certain Focused Relative Risk Information Criterion scores, and associated FRIC plots and FRIC tables, as well as to Confidence plots to exhibit the confidence the data give in the submodels. The machinery is extended to handle many focus parameters at the same time, with appropriate averaged FRIC scores. The particular case where all available covariate vectors have equal importance yields a new overall criterion for variable selection, balancing complexity and fit in a natural fashion. A connection to the Mallows criterion is demonstrated, leading also to natural modifications of the latter. The FRIC and AFRIC strategies are illustrated for real data.

Paper Structure

This paper contains 10 sections, 2 theorems, 64 equations, 5 figures, and 1 table.

Key Result

Lemma 1

For estimating $\mu=x_0^{\rm t}\beta$, the submodel based estimator $\widehat{\mu}_S=x_{0,S}^{\rm t}\widehat{\beta}_S$ has bias $\omega_S^{\rm t}\beta_{S^c}$, with $\omega_S=\Sigma_{10,S}\Sigma_{S,00}^{-1}x_{0,S}-x_{0,S^c}$, of length $|S^c|=p-|S|$. The identity $x_0^{\rm t}\Sigma_n^{-1}x_0=x_{0,S}^
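
The bias formula in Lemma 1 can be checked numerically. The sketch below is an illustration only, not code from the paper: it assumes $\Sigma_n=n^{-1}X^{\rm t}X$ with blocks indexed by $S$ and $S^c$, uses simulated covariates, and the helper names `omega_S` and `true_relative_risk` are hypothetical. It compares $\omega_S^{\rm t}\beta_{S^c}$ with the bias obtained directly from ${\rm E}\,\widehat{\beta}_S=(X_S^{\rm t}X_S)^{-1}X_S^{\rm t}X\beta$, and then evaluates the true relative risk ${\rm mse}_S/{\rm mse}_{\rm wide}$ as variance plus squared bias over the wide-model variance, for known $\beta$ and $\sigma$.

```python
import numpy as np

def omega_S(X, x0, S):
    """Return (omega_S, S^c) with omega_S = Sigma_10 Sigma_00^{-1} x_{0,S} - x_{0,S^c}.

    Assumption: Sigma_n = X^T X / n, with block 00 for S and block 10 for (S^c, S).
    """
    n, p = X.shape
    S = np.asarray(S)
    Sc = np.setdiff1d(np.arange(p), S)
    Sigma = X.T @ X / n
    S00 = Sigma[np.ix_(S, S)]
    S10 = Sigma[np.ix_(Sc, S)]
    return S10 @ np.linalg.solve(S00, x0[S]) - x0[Sc], Sc

def true_relative_risk(X, x0, beta, sigma, S):
    """True mse_S / mse_wide for known beta and sigma (not an estimate from data)."""
    w, Sc = omega_S(X, x0, S)
    bias = w @ beta[Sc]                      # bias of mu_hat_S, as in Lemma 1
    XS = X[:, S]
    var_S = sigma**2 * x0[S] @ np.linalg.solve(XS.T @ XS, x0[S])
    var_wide = sigma**2 * x0 @ np.linalg.solve(X.T @ X, x0)
    return (var_S + bias**2) / var_wide, bias

# Simulated design; all numbers here are illustrative assumptions.
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
x0 = rng.normal(size=p)
beta = np.array([1.0, 0.5, 0.0, -0.3, 0.2])
S = [0, 1, 3]

# Direct bias: E[mu_hat_S] - mu, using E[beta_hat_S] = (X_S^T X_S)^{-1} X_S^T X beta.
XS = X[:, S]
expected_beta_S = np.linalg.solve(XS.T @ XS, XS.T @ X @ beta)
direct_bias = x0[S] @ expected_beta_S - x0 @ beta

rr, lemma_bias = true_relative_risk(X, x0, beta, 1.0, S)
print("bias via Lemma 1:", lemma_bias, " direct:", direct_bias, " rr_S:", rr)
```

In this setup the two bias values agree up to rounding, and the wide model itself has relative risk 1. In practice $\beta$ and $\sigma$ are unknown, which is why the paper turns to confidence distributions for ${\rm rr}_S$ rather than plugging in true values as done here.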

Figures (5)

  • Figure 1.1: FRIC plot for the $2^5=32$ models for estimating the birthweight of the child-to-come, for Mrs. Jones (white, age 40, 60 kg, smoker). The FRIC scores are estimates of the relative risks ${\rm rr}_S={\rm mse}_S/{\rm mse}_{\rm wide}$ of (\ref{eq:hereisrr}); the blue circles are the associated point estimates; and the vertical lines are submodel-based 80% confidence intervals. Here 16 submodels have FRIC scores smaller than 1 and are judged to be better than the wide model. See Table \ref{table:table11} for identification of the best models and their FRIC scores and estimates.
  • Figure 1.2: Confidence FRIC plots for the $2^5=32$ models for estimating the birthweight of the child-to-come, for Mrs. Jones (white, age 40, 60 kg, smoker). The ${\rm conf}(S)$ values are also p-values for testing ${\rm mse}_S\le{\rm mse}_{\rm wide}$, so submodels to the left in the figure are found not useful for the task of estimating the focus parameter, the mean $\mu={\rm E}\,(Y\,|\, x_0)$. Submodels to the far right are those in which we place trust in their ability to do better than the wide model. See Table \ref{table:table11} for identification of the best models and their ${\rm conf}(S)$ scores and estimates.
  • Figure 3.1: Confidence cumulative distribution functions $C_S({\rm rr}_S)$ for the relative risks ${\rm risk}_S/{\rm risk}_{\rm wide}$, as per (\ref{eq:cdforrr}). Those with high confidence in values below 1 are better for the focused prediction job for Mrs. Jones's baby than the wide model. The median confidence estimates ${\rm FRIC}_S^{0.50}$ are also read off from the plot.
  • Figure 4.1: Confidence cumulative distribution functions $C^*_S({\rm rr}_S)$ for the relative risks ${\rm risk}_S/{\rm risk}_{\rm wide}$, for the 31 submodels for the mothers-and-babies data, for the case of putting equal importance to all available covariate vectors, i.e. using $v(u)=1/n$ for the $n$ available vectors. Only the submodel corresponding to keeping $x_2,x_3,x_4,x_5$, but excluding $x_1$, with the black full curve, exhibits a clear confidence for working better than the wide model. The ${\rm AFRIC}^{0.50}_S$ scores of (\ref{eq:africm}) are read off where the confidence curves cross $0.50$.
  • Figure 6.1: Confidence curve for the root mean squared error of the estimator $\widehat{\mu}_{\rm wide}=x_0^{\rm t}\widehat{\beta}_{\rm wide}$ for the case of Mrs. Jones. The estimate itself is 2.889 kg, and the estimated root-mse of that estimate is 0.180, with the figure displaying the full ${\rm cc}({\rm rmse}_{\rm wide})$.

Theorems & Definitions (4)

  • Lemma 1
  • proof
  • Lemma 2
  • proof