A PLS-Integrated LASSO Method with Application in Index Tracking
Shiqin Tang, Yining Dong, S. Joe Qin
TL;DR
The paper introduces PLS-integrated Lasso (PLS-Lasso), a framework that unifies dimension reduction and regularized regression in a single objective. It presents two formulations, PLS-Lasso-v1 and PLS-Lasso-v2, with algorithms that guarantee convergence to global optima: v1 uses a covariance-enhancing term with a tunable parameter, while v2 avoids that parameter via a Charnes–Cooper transformation and ADMM. In index-tracking experiments on the NASDAQ-100 and S&P 500, PLS-Lasso-v2 achieves Pareto-optimal performance, with higher sparsity and better generalization than Lasso, while PLS-Lasso-v1 is sensitive to hyperparameters. The authors suggest extensions to multivariate responses, multiple latent factors, probabilistic interpretations, and generalized penalties.
Abstract
In traditional multivariate data analysis, dimension reduction and regression have been treated as distinct endeavors. Established techniques such as principal component regression (PCR) and partial least squares (PLS) regression compute latent components as an intermediate step, each under a different criterion, before proceeding with the regression analysis. In this paper, we introduce a regression methodology named PLS-integrated Lasso (PLS-Lasso) that integrates dimension reduction directly into the regression process. We present two formulations, PLS-Lasso-v1 and PLS-Lasso-v2, along with algorithms that ensure convergence to global optima. PLS-Lasso-v1 and PLS-Lasso-v2 are compared with Lasso on the task of financial index tracking and show promising results.
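For orientation, the Lasso baseline against which both formulations are compared can be sketched as a sparse index-tracking regression. The sketch below is not the PLS-Lasso method and is not taken from the paper; it is a minimal illustration that assumes the standard Lasso objective (1/(2n))‖y − Xw‖² + λ‖w‖₁, solved by cyclic coordinate descent. The function names (`soft_threshold`, `lasso_cd`), the synthetic returns, and the value of `lam` are all hypothetical choices made for this example.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X w||^2 + lam * ||w||_1.

    X is a list of n rows (one per day) with p asset returns each;
    y is a list of n index returns. Returns the weight vector w.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - X w; equals y because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes asset j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            delta = w_new - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = w_new
    return w


# Synthetic data (an assumption for illustration): the "index" is driven by
# only 3 of 10 assets, plus a small amount of noise.
random.seed(0)
n, p = 200, 10
X = [[random.gauss(0.0, 0.01) for _ in range(p)] for _ in range(n)]
true_w = [0.5, 0.3, 0.2] + [0.0] * (p - 3)
y = [sum(x * w for x, w in zip(row, true_w)) + random.gauss(0.0, 0.0005)
     for row in X]

w_hat = lasso_cd(X, y, lam=5e-6)
selected = [j for j, w in enumerate(w_hat) if w != 0.0]
errors = [yi - sum(x * w for x, w in zip(row, w_hat)) for row, yi in zip(X, y)]
tracking_error = (sum(e * e for e in errors) / n) ** 0.5

print("selected assets:", selected)
print("tracking error (RMSE):", tracking_error)
```

In this setting, sparsity corresponds to the number of assets held in the tracking portfolio, and the penalty weight `lam` trades that sparsity off against tracking error. That trade-off is the one on which the PLS-Lasso variants are compared with Lasso.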
