FisherMask: Enhancing Neural Network Labeling Efficiency in Image Classification Using Fisher Information

Shreen Gul, Mohamed Elmahallawy, Sanjay Madria, Ardhendu Tripathy

TL;DR

FisherMask is a Fisher information-based active learning (AL) approach that identifies key network parameters by masking them according to their Fisher information values, and serves as an effective tool for measuring the sensitivity of model parameters to data samples.

Abstract

Deep learning (DL) models are popular across various domains due to their remarkable performance and efficiency. However, their effectiveness relies heavily on large amounts of labeled data, which is often time-consuming and labor-intensive to produce manually. To overcome this challenge, it is essential to develop strategies that reduce reliance on extensive labeled data while preserving model performance. In this paper, we propose FisherMask, a Fisher information-based active learning (AL) approach that identifies key network parameters by masking them based on their Fisher information values. FisherMask enhances batch AL by using Fisher information to select the most critical parameters, which in turn allows the most impactful samples to be identified during AL training. Moreover, Fisher information possesses favorable statistical properties, offering valuable insights into model behavior and a better understanding of performance characteristics within the AL pipeline. Our extensive experiments demonstrate that FisherMask significantly outperforms state-of-the-art methods on diverse datasets, including CIFAR-10 and FashionMNIST, especially under imbalanced settings. These improvements lead to substantial gains in labeling efficiency, and FisherMask thus serves as an effective tool for measuring the sensitivity of model parameters to data samples. Our code is available at https://github.com/sgchr273/FisherMask.
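The idea of masking parameters by their Fisher information and then ranking samples can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy logistic-regression model, a diagonal Fisher approximation, a top-k mask, and a simple per-sample score (the Fisher information restricted to the masked parameters). The names `diag_fisher`, `top_k_mask`, and `masked_score` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def diag_fisher(weights, samples):
    """Diagonal Fisher information of a logistic-regression model,
    averaged over unlabeled samples (no labels are needed).

    Taking the expectation over the model's own prediction y ~ Bernoulli(p),
    the squared score (y - p)^2 * x_j^2 averages to p * (1 - p) * x_j^2.
    """
    fisher = [0.0] * len(weights)
    for x in samples:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for j, xj in enumerate(x):
            fisher[j] += p * (1.0 - p) * xj * xj
    return [f / len(samples) for f in fisher]

def top_k_mask(values, k):
    """Binary mask that keeps the k entries with the largest values."""
    keep = sorted(range(len(values)), key=lambda j: values[j], reverse=True)[:k]
    mask = [0] * len(values)
    for j in keep:
        mask[j] = 1
    return mask

def masked_score(weights, x, mask):
    """Per-sample Fisher information summed over the masked parameters only.
    Assumption: this scoring rule is a simplification chosen for illustration."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
    return sum(m * p * (1.0 - p) * xi * xi for m, xi in zip(mask, x))

# Toy model and unlabeled pool (made-up numbers).
weights = [0.5, -1.0, 0.0, 2.0]
pool = [
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 2.0, 0.1, 0.0],
    [3.0, 0.0, 0.1, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]

fisher = diag_fisher(weights, pool)
mask = top_k_mask(fisher, k=2)  # keep the two most sensitive parameters
scores = [masked_score(weights, x, mask) for x in pool]
ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
```

In this toy setup the mask keeps the first two parameters, so a sample that only activates the pruned parameters receives a score of zero and is ranked last. A deep network would require per-layer gradients rather than the closed-form expression used here.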

Paper Structure

This paper contains 12 sections, 11 equations, 5 figures, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of important weights sampling. Hollow circles represent the set of unlabeled samples $\mathcal{S}$ fed into the neural network. Colored arrows depict the process of identifying important weights while pruning the remaining ones. Based on the selected weights, a subset of unlabeled instances $\mathcal{C}$ (colored circles) is chosen for labeling. This subset is then sent to an oracle for labeling, after which the model will be trained on this newly labeled data, shown in the lower-left portion of the figure, completing one AL round.
  • Figure 2: Profile of important weights across ResNet-18.
  • Figure 3: Overview of FisherMask's framework.
  • Figure 4: Data regime for imbalanced CIFAR-10.
  • Figure 5: Results for imbalanced datasets.
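
The AL round described in the Figure 1 caption (score the unlabeled pool, choose a subset, send it to an oracle, retrain) can be sketched generically as follows. This is a minimal sketch under stated assumptions, not the paper's algorithm: `active_learning_round`, `score_fn`, `oracle`, and `train_fn` are hypothetical names, and the toy scoring function is a stand-in for a Fisher-based criterion.

```python
def active_learning_round(pool, labeled, score_fn, oracle, train_fn, batch_size):
    """Run one AL round: rank the pool, label the top batch, retrain.

    pool       -- list of unlabeled samples (the set S in Figure 1)
    labeled    -- list of (sample, label) pairs collected so far
    score_fn   -- maps a sample to a number; higher means more informative
    oracle     -- maps a sample to its label (e.g. a human annotator)
    train_fn   -- maps the labeled list to a trained model
    """
    order = sorted(range(len(pool)), key=lambda i: score_fn(pool[i]), reverse=True)
    chosen_idx = set(order[:batch_size])  # indices of the chosen subset C
    chosen = [pool[i] for i in order[:batch_size]]
    new_labeled = labeled + [(x, oracle(x)) for x in chosen]
    remaining = [x for i, x in enumerate(pool) if i not in chosen_idx]
    model = train_fn(new_labeled)
    return model, new_labeled, remaining

# Toy usage with placeholder components (all assumptions for illustration).
pool = [0.1, 0.9, 0.5, 0.45]
score_fn = lambda x: -abs(x - 0.5)   # stand-in score: closeness to a boundary at 0.5
oracle = lambda x: int(x >= 0.5)     # stand-in labeler
train_fn = lambda data: len(data)    # stand-in "model": number of labeled examples

model, labeled, remaining = active_learning_round(
    pool, [], score_fn, oracle, train_fn, batch_size=2
)
```

Repeating this function until the labeling budget is spent gives the full AL loop; only `score_fn` would change when swapping in a different acquisition criterion.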