Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Elizabeth Harris; Ali Eshragh; Bishnu Lamichhane; Jordan Shaw-Carmody; Elizabeth Stojanovski

Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Elizabeth Harris, Ali Eshragh, Bishnu Lamichhane, Jordan Shaw-Carmody, Elizabeth Stojanovski

Abstract

The Minimum Volume Covering Ellipsoid (MVCE) problem, characterised by $n$ observations in $d$ dimensions where $n \gg d$, can be computationally very expensive in the big data regime. We apply methods from randomised numerical linear algebra to develop a data-driven leverage score sampling algorithm for solving MVCE, and establish theoretical error bounds and a convergence guarantee. Assuming the leverage scores follow a power law decay, we show that the computational complexity of computing the approximation for MVCE is reduced from $\mathcal{O}(nd^2)$ to $\mathcal{O}(nd + \text{poly}(d))$, which is a significant improvement in big data problems. Numerical experiments demonstrate the efficacy of our new algorithm, showing that it substantially reduces computation time and yields near-optimal solutions.

Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Abstract

The Minimum Volume Covering Ellipsoid (MVCE) problem, characterised by

observations in

dimensions where

, can be computationally very expensive in the big data regime. We apply methods from randomised numerical linear algebra to develop a data-driven leverage score sampling algorithm for solving MVCE, and establish theoretical error bounds and a convergence guarantee. Assuming the leverage scores follow a power law decay, we show that the computational complexity of computing the approximation for MVCE is reduced from

, which is a significant improvement in big data problems. Numerical experiments demonstrate the efficacy of our new algorithm, showing that it substantially reduces computation time and yields near-optimal solutions.

Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Abstract

Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

Abstract

Paper Structure

Table of Contents

Key Result

Figures (3)

Theorems & Definitions (24)