A subspace method for large-scale trace ratio problems

G. Ferrandi; M. E. Hochstenbach; M. R. Oliveira

TL;DR

This paper develops a matrix-free, Davidson-type subspace method for large-scale trace ratio (TR) problems, which maximize $\rho(V)=\frac{\mathrm{tr}(V^TAV)}{\mathrm{tr}(V^TBV)}$, targeting sparse or high-dimensional settings where only the action of $A$ and $B$ is available. Each iteration has three phases: extraction solves a projected TR problem within the search subspace, expansion enriches the subspace with residual information, and a restart strategy guarantees that $\rho$ is nondecreasing. The authors bound the error of the approximate solution as the angle between the search subspace and the exact solution approaches zero, and apply the approach to Fisher's discriminant analysis (FDA) and multigroup classification, including regularization for large-scale problems. Numerical experiments on synthetic and real datasets (e.g., Fashion-MNIST and German Traffic Sign) indicate that the method tends to be more efficient than iterative approaches that rely on (partial) eigenvalue decompositions at each step, with competitive classification accuracy.
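The extraction, expansion, and restart cycle can be illustrated with a minimal NumPy sketch. This is a simplified illustration under stated assumptions, not the authors' algorithm: the expansion uses the plain block residual of $A-\rho B$ without any preconditioning, the restart keeps only the current approximation $V$, and the names `solve_projected_tr`, `subspace_trace_ratio`, `apply_A`, `apply_B`, and `m_max` are hypothetical. The matrices $H$ and $K$ denote the projections of $A$ and $B$ onto the search subspace, following the notation of the Figure 2 caption.

```python
import numpy as np

def solve_projected_tr(H, K, k, rho, tol=1e-13, max_iter=100):
    """Solve the small dense TR problem for (H, K) by repeated eigendecomposition."""
    for _ in range(max_iter):
        _, Q = np.linalg.eigh(H - rho * K)      # eigenvalues in ascending order
        Z = Q[:, -k:]                           # eigenvectors of the k largest
        rho_new = np.trace(Z.T @ H @ Z) / np.trace(Z.T @ K @ Z)
        converged = abs(rho_new - rho) <= tol * max(1.0, abs(rho_new))
        rho = rho_new
        if converged:
            break
    return rho, Z

def subspace_trace_ratio(apply_A, apply_B, p, k, m_max=None, tol=1e-8,
                         max_iter=200, seed=0):
    """Simplified subspace sketch; apply_A/apply_B map a (p, j) block to A @ X, B @ X."""
    m_max = max(m_max or 4 * k, 2 * k)          # assumed cap on the subspace size
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((p, k)))
    AU, BU = apply_A(U), apply_B(U)
    rho = np.trace(U.T @ AU) / np.trace(U.T @ BU)
    history = [rho]
    for _ in range(max_iter):
        # Extraction: projected TR problem in span(U).
        H, K = U.T @ AU, U.T @ BU
        H, K = (H + H.T) / 2, (K + K.T) / 2
        rho, Z = solve_projected_tr(H, K, k, rho)
        history.append(rho)
        V, AV, BV = U @ Z, AU @ Z, BU @ Z
        # Block residual of A - rho*B with respect to span(V).
        G = AV - rho * BV
        R = G - V @ (V.T @ G)
        if np.linalg.norm(R, 2) <= tol:         # spectral norm of the residual
            break
        # Restart: keep V, so the next projected maximum cannot drop below rho.
        if U.shape[1] + k > m_max:
            U, AU, BU = V, AV, BV
        # Expansion: append the part of R that is orthogonal to span(U).
        m = U.shape[1]
        Q, _ = np.linalg.qr(np.hstack([U, R]))
        T = Q[:, m:]
        U = np.hstack([U, T])
        AU = np.hstack([AU, apply_A(T)])
        BU = np.hstack([BU, apply_B(T)])
    return rho, V, history
```

Because the expanded or restarted subspace always contains the current $V$, the projected maximum in the next iteration is at least the current $\rho$, which is the mechanism behind the nondecreasing trace ratio in this sketch. Only products with $A$ and $B$ are required, so `apply_A` and `apply_B` can wrap sparse or implicitly defined operators.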

Abstract

A subspace method is introduced to solve large-scale trace ratio problems. This approach is matrix-free, requiring only the action of the two matrices involved in the trace ratio. At each iteration, a smaller trace ratio problem is addressed in the search subspace. Additionally, the algorithm is endowed with a restarting strategy that ensures the monotonicity of the trace ratio value throughout the iterations. The behavior of the approximate solution is investigated from a theoretical viewpoint, extending existing results on Ritz values and vectors, as the angle between the search subspace and the exact solution approaches zero. Numerical experiments in multigroup classification show that this new subspace method tends to be more efficient than iterative approaches relying on (partial) eigenvalue decompositions at each step.
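For comparison, the kind of baseline mentioned in the last sentence can be sketched as follows. This is the classical fixed-point iteration for trace ratio problems, in which $V$ is set to the eigenvectors of the $k$ largest eigenvalues of $A-\rho B$ and $\rho$ is then updated; it is shown here with a full dense eigendecomposition for simplicity, whereas a large-scale code would use a partial eigensolver. The function names are hypothetical, and the sketch is not taken from the paper.

```python
import numpy as np

def trace_ratio(A, B, V):
    """Evaluate rho(V) = tr(V^T A V) / tr(V^T B V)."""
    return np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)

def trace_ratio_eig_iteration(A, B, k, tol=1e-12, max_iter=100, seed=0):
    """Fixed-point iteration with one symmetric eigendecomposition per step."""
    p = A.shape[0]
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((p, k)))   # random orthonormal start
    rho = trace_ratio(A, B, V)
    for _ in range(max_iter):
        _, Q = np.linalg.eigh(A - rho * B)             # ascending eigenvalues
        V = Q[:, -k:]                                  # k largest eigenvalues
        rho_new = trace_ratio(A, B, V)
        converged = abs(rho_new - rho) <= tol * max(1.0, abs(rho_new))
        rho = rho_new
        if converged:
            break
    return rho, V
```

At a fixed point, the sum of the $k$ largest eigenvalues of $A-\rho B$ is zero, which gives a simple check on the returned value. Each step costs an eigendecomposition of a $p \times p$ matrix, which is the expense a subspace method aims to avoid.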

Paper Structure

This paper contains 21 sections, 8 theorems, 45 equations, 4 figures, 3 tables, 3 algorithms.

Introduction
Overview of the trace ratio problem
A Davidson type method for trace ratio problems
Subspace extraction
Subspace expansion
Restart
Algorithm
Analysis
A bound for the approximate eigenvalues
A bound for the approximate solution
A bound for the approximate solution in terms of the residual matrix
Classification task
A Davidson type method for FDA
Regularization in the large-scale setting
Classification rule
...and 6 more sections

Key Result

Proposition 1

Let $A$ be symmetric and $B$ be symmetric positive semidefinite, with ${\rm rank}(B) \ge p-k+1$. Then the trace ratio maximization problem (labeled eq:TR.MaxProblem in the paper) admits a finite maximum.
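
The role of the rank condition can be seen from a short argument. The sketch below assumes that the feasible set consists of matrices $V \in \mathbb{R}^{p \times k}$ with orthonormal columns, which this summary does not state explicitly; $\mathcal{N}(B)$ denotes the null space of $B$ and $v_i$ the columns of $V$.

```latex
% Assumption (not stated in this summary): V \in \mathbb{R}^{p \times k}, V^T V = I_k.
\begin{align*}
\mathrm{tr}(V^T B V) &= \sum_{i=1}^{k} v_i^T B v_i = \sum_{i=1}^{k} \| B^{1/2} v_i \|^2 \ge 0, \\
\mathrm{tr}(V^T B V) = 0 &\iff B v_i = 0 \ \text{for all } i \iff \mathrm{span}(V) \subseteq \mathcal{N}(B), \\
\dim \mathcal{N}(B) &= p - \mathrm{rank}(B) \le k - 1 < k = \dim \mathrm{span}(V).
\end{align*}
% Hence span(V) cannot lie in N(B), so tr(V^T B V) > 0 for every feasible V.
% rho is then continuous on the compact set {V : V^T V = I_k}, so it attains
% a finite maximum there.
```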

Figures (4)

Figure 1: Numerical and theoretical quantities to check the convergence of the approximate solution to TR to the exact solution.
Figure 2: On the left: $(\rho^\star - \rho)\beta^\ast$ is bounded by the squared norm of the residual matrix. On the right: the approximate eigenvalues $\lambda_i(H-\rho K)$ converge to the $k$ largest eigenvalues of $A-\rho B$ throughout the iterations.
Figure 3: FDA and TR for Fashion MNIST, solved by various methods. The number of matrix-vector (MV) products is plotted against the spectral norm of the residual matrix, $\|R\|$. The graphs are for different levels of regularization in $S_W$.
Figure 4: Trace ratio value $\rho$ per number of matrix-vector (MV) products. The graphs are for different levels of regularization in $S_W$.

Theorems & Definitions (17)

Proposition 1
Proposition 2
Example 3
Proposition 4
proof
Proposition 5
proof
Proposition 6
proof
Corollary 7
...and 7 more
