Self-Supervised Learning for Covariance Estimation

Tzvi Diskin, Ami Wiesel

TL;DR

This work proposes learning a neural network globally and then applying it locally at inference time. Drawing on recent advances in self-supervised foundation models, the network is trained without any labels by masking samples and learning to predict their covariance from their surrounding neighbors.

Abstract

We consider the use of deep learning for covariance estimation. We propose to globally learn a neural network that will then be applied locally at inference time. Leveraging recent advancements in self-supervised foundational models, we train the network without any labeling by simply masking different samples and learning to predict their covariance given their surrounding neighbors. The architecture is based on the popular attention mechanism. Its main advantage over classical methods is the automatic exploitation of global characteristics without any distributional assumptions or regularization. It can be pre-trained as a foundation model and then be repurposed for various downstream tasks, e.g., adaptive target detection in radar or hyperspectral imagery.
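The masking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`masked_pairs`, `baseline_predictor`), the 1-D window layout, and the zero-mean Gaussian negative log-likelihood used as the training loss are all assumptions made for illustration, and a diagonally loaded sample covariance stands in for the attention-based network.

```python
import numpy as np

def masked_pairs(data, half_window):
    """Yield (neighbors, target) pairs from a sequence of d-dim samples.

    data has shape (n, d). For each index i, the target is data[i]
    (the "masked" sample) and the neighbors are the half_window samples
    on each side of it, with the target itself excluded.
    """
    n = data.shape[0]
    for i in range(half_window, n - half_window):
        left = data[i - half_window:i]
        right = data[i + 1:i + 1 + half_window]
        yield np.concatenate([left, right]), data[i]

def gaussian_nll(y, sigma):
    """Zero-mean Gaussian negative log-likelihood, up to constants.

    Assumption: this loss is chosen for illustration only.
    """
    sign, logdet = np.linalg.slogdet(sigma)
    assert sign > 0, "predicted covariance must be positive definite"
    return float(y @ np.linalg.solve(sigma, y) + logdet)

def baseline_predictor(neighbors, loading=1e-3):
    """Hypothetical stand-in for the network: loaded sample covariance."""
    k, d = neighbors.shape
    return neighbors.T @ neighbors / k + loading * np.eye(d)

# Synthetic zero-mean data with a known covariance.
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 0.8], [0.8, 1.0]])
data = rng.multivariate_normal(np.zeros(2), true_cov, size=500)

# Self-supervised objective: score each predicted covariance against
# the masked sample it was not allowed to see.
losses = [
    gaussian_nll(y, baseline_predictor(nb))
    for nb, y in masked_pairs(data, half_window=8)
]
mean_loss = float(np.mean(losses))
```

In a learned version, `baseline_predictor` would be replaced by a trainable model whose parameters are updated to reduce `mean_loss` over many such masked pairs; no labels are needed because the masked sample itself supplies the training signal.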

Paper Structure

This paper contains 7 sections, 2 theorems, 22 equations, 4 figures, and 1 table.

Key Result

Theorem 1

Let $y$ and $x$ be random variables with a joint distribution $p(x,y)$. Assume $y$ is zero mean and that ${\rm{E}}\left[yy^T|x\right]$ is non-singular for any $x$. Then

Figures (4)

  • Figure 1: Label and feature windows around test sample
  • Figure 2: Architecture of SSCE
  • Figure 3: ROC curve of AMF detector in the synthetic data experiment. SSCE beats its competitors and is close to the oracle AMF, which uses the true covariance matrices.
  • Figure 4: ROC curve of ANMF detector in the IPIX data experiment. SSCE gives better results than its competitors.

Theorems & Definitions (5)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Example 1