Elliptical Wishart distributions: information geometry, maximum likelihood estimator, performance analysis and statistical learning

Imen Ayadi, Florent Bouchard, Frédéric Pascal

TL;DR

Two algorithms to compute the maximum likelihood estimator (MLE) are proposed: a fixed-point algorithm and a Riemannian optimization method based on the derived information geometry of Elliptical Wishart distributions.

Abstract

This paper deals with Elliptical Wishart distributions, which generalize the Wishart distribution, in the context of signal processing and machine learning. Two algorithms to compute the maximum likelihood estimator (MLE) are proposed: a fixed-point algorithm and a Riemannian optimization method based on the derived information geometry of Elliptical Wishart distributions. The existence and uniqueness of the MLE are characterized, as well as the convergence of both estimation algorithms. Statistical properties of the MLE are also investigated, such as consistency, asymptotic normality and an intrinsic version of Fisher efficiency. On the statistical learning side, novel classification and clustering methods are designed. For the $t$-Wishart distribution, the performance of the MLE and of the statistical learning algorithms is evaluated on both simulated and real EEG and hyperspectral data, showcasing the interest of the proposed methods.
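To make the fixed-point idea concrete, here is a minimal NumPy sketch. It is not the paper's algorithm verbatim. It assumes the density is proportional to $\det(\boldsymbol{G})^{-n/2}\,h(\operatorname{tr}(\boldsymbol{G}^{-1}\boldsymbol{S}))$, so that setting the gradient of the negative log-likelihood to zero gives $\boldsymbol{G}=\frac{1}{nK}\sum_{k=1}^{K}u(\operatorname{tr}(\boldsymbol{G}^{-1}\boldsymbol{S}_k))\,\boldsymbol{S}_k$ with $u(t)=-2h'(t)/h(t)$. The function name `fixed_point_mle`, the stopping rule and the default tolerances are illustrative choices. The $t$-Wishart weight $u(t)=(\nu+np)/(\nu+t)$ follows from assuming $h(t)=(1+t/\nu)^{-(\nu+np)/2}$.

```python
import numpy as np

def fixed_point_mle(S, n, u, max_iter=500, tol=1e-10):
    """Iterate G <- (1 / (n K)) * sum_k u(tr(G^{-1} S_k)) S_k.

    S : array of shape (K, p, p), symmetric positive semi-definite samples
    n : degrees of freedom of the (Elliptical) Wishart model
    u : vectorized weight function, assumed to be u(t) = -2 h'(t) / h(t)
    """
    K = S.shape[0]
    # Start from the plain Wishart estimate (the case u = 1).
    G = S.mean(axis=0) / n
    for _ in range(max_iter):
        # t_k = tr(G^{-1} S_k) for every sample k.
        t = np.einsum("ij,kji->k", np.linalg.inv(G), S)
        # Weighted average of the samples.
        G_new = np.einsum("k,kij->ij", u(t), S) / (n * K)
        # Remove floating-point asymmetry.
        G_new = (G_new + G_new.T) / 2
        # Stop when the relative change is below the tolerance.
        if np.linalg.norm(G_new - G) <= tol * np.linalg.norm(G):
            return G_new
        G = G_new
    return G

# Illustrative usage on synthetic data (Gaussian-generated, for shape only).
rng = np.random.default_rng(0)
p, n, K, nu = 3, 10, 50, 5.0
X = rng.standard_normal((K, p, n))
S = X @ X.transpose(0, 2, 1)

# Assumed t-Wishart weight function.
u_t = lambda t: (nu + n * p) / (nu + t)
G_hat = fixed_point_mle(S, n, u_t)
```

With the constant weight `u = lambda t: np.ones_like(t)`, the iteration stops immediately at the sample mean divided by $n$, which is the Wishart estimator under the same assumed normalization.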

Paper Structure

This paper contains 17 sections, 12 theorems, 36 equations, 2 figures, 3 tables, 4 algorithms.

Key Result

Proposition 1

The Fisher information metric of Elliptical Wishart distributions is, for $\boldsymbol{G}\in\mathcal{S}^{++}_{p}$, $\boldsymbol{\xi}$ and $\boldsymbol{\eta}\in\mathcal{S}_{p}$, where, given $u(t)=-2h'(t)/h(t)$,

Figures (2)

  • Figure 1: Comparison over 20 runs between the fixed-point algorithm presented in Algorithm \ref{algo:fixed_point} (FP) and the Riemannian optimization based algorithm provided in Algorithm \ref{algo:riem} (RCG). For the Riemannian algorithm, a Riemannian conjugate gradient method is leveraged. The estimation error $\delta^2(\boldsymbol{\widehat{G}},\boldsymbol{G})$ is plotted as a function of (i) the number of iterations and (ii) time. Fixed parameters are $p=10$, $K=300$ and $\nu=10$.
  • Figure 2: Mean and standard deviation of error measure $\delta^2(\boldsymbol{\widehat{G}},\boldsymbol{G})$ as a function of the number of matrices $K$ for the Wishart maximum likelihood estimator \ref{eq:mle_Wishart} and the $t$-Wishart maximum likelihood estimator computed with Algorithm \ref{algo:riem} (Riemannian conjugate gradient). The intrinsic Cramér-Rao bound \ref{eq:icrb} is also displayed. Fixed parameters are $p=10$ and $n=100$. Degrees of freedom for simulated $t$-Wishart random matrices $\{\boldsymbol{S}_k\}_{k=1}^K$ are $\nu=10$ (left) and $\nu=100$ (right). In both cases, the $t$-Wishart maximum likelihood estimator is computed with the correct value of $\nu$. Means and standard deviations are computed over $200$ Monte Carlo repetitions.

Theorems & Definitions (28)

  • Definition 1: Elliptical Wishart distributions \cite{teng1989generalized}
  • Proposition 1: Fisher information metric
  • proof
  • Remark 1
  • Proposition 2: Euclidean gradient of the negative log-likelihood \ref{eq:log_lik}
  • proof
  • Proposition 3: Euclidean gradient to Riemannian gradient \cite{bouchard2021riemannian}
  • Remark 2
  • Remark 3
  • Proposition 4: Existence and uniqueness of Elliptical Wishart MLE
  • ...and 18 more