Subspace Ordering for Maximum Response Preservation in Sufficient Dimension Reduction
Derik T. Boonstra, Rakheon Kim, Dean M. Young
TL;DR
This work challenges the conventional eigenvalue-based ordering of SDR directions and introduces predictive-importance criteria that directly measure a subspace's relevance to the response. The authors define a T-statistic-based measure $T_j$ for binary responses and an F-statistic-based measure $F_j$ that extends to multiclass and continuous responses (with population analogues $\Delta_j$ and $\Psi_j$), establish consistency results, and demonstrate through simulations and real data that ordering by these criteria yields improved prediction and subspace recovery. The approach unifies discriminant analysis perspectives with SDR and shows that traditional variance-focused ordering can miss the directions most informative for prediction. The proposed criteria can turn unsupervised methods into pseudo-supervised ones and are supported by publicly available code, offering a practical, interpretable alternative for subspace selection in SDR tasks.
Abstract
Sufficient dimension reduction (SDR) methods aim to identify a dimension reduction subspace (DRS) that preserves all the information about the conditional distribution of a response given its predictors. Traditional SDR methods determine the DRS by solving a method-specific generalized eigenvalue problem and selecting the eigenvectors corresponding to the largest eigenvalues. In this article, we argue against the long-standing convention of using eigenvalues as the measure of subspace importance and propose alternative ordering criteria that directly assess the predictive relevance of each subspace. For a binary response, we introduce a subspace ordering criterion based on the absolute value of the independent two-sample Student's T-statistic. Theoretically, our criterion identifies subspaces that achieve the local minimum Bayes' error rate and yields consistent ordering of directions under mild regularity conditions. Additionally, we employ an F-statistic to provide a framework that unifies categorical and continuous responses under a single subspace criterion. We evaluate our proposed criteria within multiple SDR methods through extensive simulation studies and applications to real data. Our empirical results demonstrate the efficacy of reordering subspaces using our proposed criteria, which generally improves classification accuracy and subspace estimation compared to ordering by eigenvalues.
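The binary-response idea in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names (`abs_t_statistic`, `project`, `order_directions`), the pooled-variance form of the T-statistic, and the toy data are all assumptions made for illustration. The sketch projects the predictors onto each candidate direction, computes the absolute two-sample T-statistic of the projected scores between the two classes, and ranks directions by that value instead of by eigenvalue.

```python
import math
from statistics import mean, variance


def abs_t_statistic(scores, labels):
    """Absolute pooled-variance two-sample T-statistic of 1-D scores.

    Assumption: labels are coded 0/1 and each class has at least two points.
    """
    g0 = [s for s, y in zip(scores, labels) if y == 0]
    g1 = [s for s, y in zip(scores, labels) if y == 1]
    n0, n1 = len(g0), len(g1)
    pooled_var = ((n0 - 1) * variance(g0) + (n1 - 1) * variance(g1)) / (n0 + n1 - 2)
    return abs(mean(g0) - mean(g1)) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def project(X, direction):
    """Project each row of X onto a single direction (dot product)."""
    return [sum(x_k * v_k for x_k, v_k in zip(row, direction)) for row in X]


def order_directions(X, labels, directions):
    """Return direction indices sorted by decreasing |T|, plus the |T| values."""
    stats = [abs_t_statistic(project(X, v), labels) for v in directions]
    order = sorted(range(len(directions)), key=lambda j: stats[j], reverse=True)
    return order, stats


# Hypothetical toy data: the second coordinate has large variance but no
# class separation; the first has small variance but separates the classes.
X = [[0.0, 5.0], [0.2, -5.0], [-0.2, 4.0], [0.1, -4.0],
     [1.0, 5.0], [1.2, -5.0], [0.8, 4.0], [1.1, -4.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Candidate directions, listed with the high-variance one first, as a
# variance-focused ordering would place it.
directions = [[0.0, 1.0], [1.0, 0.0]]

order, stats = order_directions(X, y, directions)
print(order)
```

On this toy data the high-variance direction has identical class means, so its |T| is zero and the reordering moves the low-variance, class-separating direction to the front. An F-statistic across class (or sliced-response) groups could replace `abs_t_statistic` in the same way for multiclass or continuous responses, again as an assumption about how the criterion might be applied.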
