Large multi-response linear regression estimation based on low-rank pre-smoothing

Xinle Tian, Alex Gibberd, Matthew Nunes, Sandipan Roy

Abstract

Pre-smoothing is a technique aimed at increasing the signal-to-noise ratio in data to improve subsequent estimation and model selection in regression problems. To date, pre-smoothing has been limited to the univariate response regression setting, yet many scientific applications involve multi-response regression problems, particularly ones in which the number of responses is large. Motivated by this setting, this article proposes a technique for data pre-smoothing based on low-rank approximation. We establish theoretical results on the performance of the proposed methodology, which show that in this large-response setting, the proposed technique outperforms ordinary least squares estimation under the mean squared error criterion, whilst being computationally more efficient than alternative approaches such as reduced rank regression. We quantify our estimator's benefit empirically in a number of simulated experiments. We also demonstrate our proposed low-rank pre-smoothing technique on real data arising from the environmental and biological sciences.
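
To make the idea of pre-smoothing by low-rank approximation concrete, the following is a minimal sketch under stated assumptions, not the authors' algorithm. The abstract does not specify the construction, so both the smoothing step (a rank-$k$ truncated SVD of the response matrix $Y$) and the subsequent ordinary least squares fit are assumptions made for illustration; the function name `low_rank_presmooth_ols` and the simulated dimensions are hypothetical.

```python
import numpy as np


def low_rank_presmooth_ols(X, Y, k):
    """Hypothetical sketch: smooth Y by a rank-k truncated SVD, then fit OLS.

    X : (n, p) design matrix, Y : (n, q) response matrix, k : target rank.
    The choice of truncated SVD as the smoother is an assumption.
    """
    # Thin SVD of the response matrix.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Keep only the k leading singular triplets (best rank-k approximation of Y).
    Y_smooth = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Ordinary least squares of the smoothed responses on X.
    B_hat, *_ = np.linalg.lstsq(X, Y_smooth, rcond=None)
    return B_hat, Y_smooth


# Simulated example with a low-rank coefficient matrix (all values illustrative).
rng = np.random.default_rng(0)
n, p, q, k = 100, 10, 100, 5
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, k)) @ rng.standard_normal((k, q))  # rank at most k
Y = X @ B + rng.standard_normal((n, q))

B_lrps, Y_smooth = low_rank_presmooth_ols(X, Y, k)
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)  # plain OLS for comparison
```

In this sketch the only cost beyond plain OLS is a single SVD of $Y$, and the resulting coefficient estimate has rank at most $k$ because it is a linear map applied to a rank-$k$ matrix.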

Paper Structure

This paper contains 23 sections, 5 theorems, 46 equations, 12 figures, 7 tables.

Key Result

Theorem 1

(Asymptotic distribution of LRPS.) Under model (eq:mrreg), suppose the moments of the errors $\{e_{1},\ldots,e_{n}\}$ exist (and are finite) up to the fourth order. Assuming that $\lim_{n\rightarrow\infty}S_{X}=\Sigma_{X}$, then we have

Figures (12)

  • Figure 1: Performance of LRPS and RRR in terms of the empirical $\widehat{\mathrm{MSE}}=m^{-1}\sum_{s=1}^{m}\|Y^{(s)}-X^{(s)}B^{(s)}\|_{F}^{2}$, where $Y^{(s)}$, $X^{(s)}$ and $B^{(s)}$ represent the data and the estimate (LRPS in red, RRR in black) for simulation $s$, and $m=100$ is the number of simulations.
  • Figure 2: The $\mathrm{bias}^{2}(\tilde{B})$, $\mathrm{Var}(\tilde{B})$, and $\mathrm{MSE}(\tilde{B})$ values extrapolated from the LRPS asymptotic distribution in Theorem 1, compared to the finite-sample (empirical) MSE for both LRPS and RRR. In this simulation, $k_{*}=5$ and, other than the sample size, the conditions are the same as those given in the left panel of Figure 1.
  • Figure 3: Distribution of singular values for the signal $\gamma_{j}(B)$ and noise $\gamma_{j}^{1/2}(\Sigma_{e})$, in black and red respectively. The signal-to-noise ratio $\mathrm{SNR}:=\|XB\|_{F}/\|\Sigma_{e}\|_{F}$ is given next to each line, and the lines represent the average value of the spectrum over $m=100$ simulations.
  • Figure 4: Comparison of LRPS and RRR performance as a function of $k$ in the case where $p=10$, $q=100$, $n=100$. Black lines denote the performance of RRR and red lines that of LRPS. In the middle and right plots, we fix $\sigma^{2}=1$ and vary the eigenvalues of the signal and noise respectively by adjusting $\lambda$ and $\rho$. In $B$-Condition 3, we set all the singular values of $B$ to one and maintain $r(B)=p$, indicated by the dashed line. In the left plot, we set the sparsity to $s=5$.
  • Figure 5: Comparison of LRPS and RRR performance as a function of $k$ in the case where $q=10$, $n=100$ and $p\in\{20,40,60\}$. The covariance of the errors is given by $\Sigma$-Condition 1 with $\sigma^{2}\in\{0.5,1,2\}$. The true rank of $B$ is denoted by the vertical grey line.
  • ...and 7 more figures

Theorems & Definitions (8)

  • Definition 1
  • Theorem 1
  • Definition 2
  • Proposition 1
  • Lemma 1
  • proof
  • Theorem
  • Corollary