Application of Tensorized Neural Networks for Cloud Classification

Alifu Xiafukaiti, Devanshu Garg, Aruto Hosaka, Koichi Yanagisawa, Yuichiro Minato, Tsuyoshi Yoshida

TL;DR

The paper tackles the computational and memory bottlenecks of CNNs in cloud classification by introducing tensorized neural networks (TNNs) that factorize dense layers, integrate attention, and are trained with contrastive self-supervised learning to classify cloud imagery. The approach achieves dramatic parameter reductions (up to 95.4%) and notable speedups (up to 22.4%) while evaluating on the CCSN dataset across 11 cloud categories. It provides insights into how batch size relative to GPU SM counts influences accuracy for both tensorized and conventional models, highlighting practical regimes where TNNs offer superior performance. The work demonstrates the potential of TNNs for real-time meteorological inference, offering guidance on tensorization design and training strategies for resource-constrained cloud classification tasks.

Abstract

Convolutional neural networks (CNNs) have gained widespread usage across various fields such as weather forecasting, computer vision, autonomous driving, and medical image analysis due to their exceptional ability to extract spatial information, share parameters, and learn local features. However, the practical implementation and commercialization of CNNs in these domains are hindered by challenges related to model size, overfitting, and computational time. To address these limitations, our study proposes a groundbreaking approach that tensorizes the dense layers in the CNN to reduce model size and computational time. Additionally, we incorporate attention layers into the CNN and train it using contrastive self-supervised learning to effectively classify cloud information, which is crucial for accurate weather forecasting. We elucidate the key characteristics of tensorized neural networks (TNNs), including the data compression rate, accuracy, and computational speed. The results indicate how the properties of TNNs change with the batch size setting.
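
As a rough illustration of what tensorizing a dense layer means, the sketch below follows the scheme described in the Figure 3 caption: the flattened input is reshaped into a rank-2 tensor and contracted with two rank-3 weight tensors instead of one large weight matrix. The function names, the index layout of the two factors, and the example shapes are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import numpy as np


def dense_params(n_in, n_out):
    """Number of weights in an ordinary dense layer (bias ignored)."""
    return n_in * n_out


def tensorized_params(i1, i2, o1, o2, rank):
    """Number of weights in the two rank-3 factors (bias ignored)."""
    return i1 * o1 * rank + rank * i2 * o2


def tensorized_dense(x, a, b):
    """Apply a two-factor tensorized dense layer.

    x: (batch, i1 * i2) flattened features
    a: (i1, o1, rank)   first rank-3 factor (hypothetical layout)
    b: (rank, i2, o2)   second rank-3 factor (hypothetical layout)
    returns: (batch, o1 * o2)
    """
    i1, o1, rank_a = a.shape
    rank_b, i2, o2 = b.shape
    assert rank_a == rank_b, "the shared bond index must match"
    x2 = x.reshape(-1, i1, i2)  # input as a rank-2 tensor per sample
    # Contract the two input indices (i, j) and the shared bond index r.
    y = np.einsum("bij,ipr,rjq->bpq", x2, a, b)
    return y.reshape(-1, o1 * o2)


def equivalent_dense_weight(a, b):
    """Rebuild the full weight matrix that the two factors represent."""
    i1, o1, _ = a.shape
    _, i2, o2 = b.shape
    w = np.einsum("ipr,rjq->ijpq", a, b)
    return w.reshape(i1 * i2, o1 * o2)


if __name__ == "__main__":
    # Illustrative shapes only; not the layer sizes used in the paper.
    full = dense_params(32 * 32, 16 * 16)
    small = tensorized_params(32, 32, 16, 16, rank=8)
    print(f"dense: {full} weights, tensorized: {small} weights")
    print(f"parameter reduction: {100 * (1 - small / full):.2f}%")
```

The output dimensionality is unchanged, while the parameter count depends on the chosen bond dimension (`rank` here); a smaller bond dimension compresses more but restricts which weight matrices the layer can represent.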

Paper Structure (1 section, 1 equation, 7 figures, 2 tables)

Table of Contents

  1. Methodology

Figures (7)

  • Figure 1: Representative sample images of the 11 cloud categories from the CCSN database.
  • Figure 2: The pipeline for contrastive self-supervised learning (CSSL): each image generates two augmentations, which are fed into the base model for feature representation. These representations are passed to the projection head, which produces the latent representation of the images. The objective is to minimize the distance between the latent representations of the same image. Once training is complete, the projection head is removed, and the output from the base model is used for downstream tasks. [ref-simclr]
  • Figure 3: The tensor decomposition scheme: (a) The input X is obtained from the base model after flattening. W represents the weight matrix of the first dense layer. A line joining two tensors denotes a shared index that is contracted, resulting in the output Y. This output serves as the input for the subsequent dense layer. (b) The input tensor X is reshaped into a rank-2 tensor. The weight matrix is now represented by two smaller rank-3 weight tensors. This approach maintains the same output dimensionality while reducing the number of parameters required to represent the original weight tensor.
  • Figure 4: Top-1 accuracy measured on a node with 8 × A100 GPUs.
  • Figure 5: Top-1 accuracy measured on a node with 4 × V100 GPUs.
  • ...and 2 more figures
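
The contrastive objective outlined in the Figure 2 caption (two augmentations per image, with the latent representations of the same image pulled together) can be sketched with the NT-Xent loss used in SimCLR. This is an assumption: the summary does not state which loss or temperature the authors use, and the function name and default temperature below are illustrative only.

```python
import numpy as np


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for two batches of projection-head outputs.

    z1, z2: (n, d) latent representations of the two augmentations,
            where row k of z1 and row k of z2 come from the same image.
    temperature: assumed value, not taken from the paper.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = (z @ z.T) / temperature                      # (2n, 2n)
    np.fill_diagonal(sim, -np.inf)                     # drop self-pairs

    # Index of the positive partner for every row: k <-> k + n.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax value at the positive entry.
    row_max = sim.max(axis=1, keepdims=True)
    log_sum_exp = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob_pos = sim[np.arange(2 * n), pos] - log_sum_exp
    return float(-log_prob_pos.mean())


if __name__ == "__main__":
    views = np.eye(4)
    aligned = nt_xent_loss(views, views)
    mismatched = nt_xent_loss(views, np.roll(views, 1, axis=0))
    print(f"aligned pairs: {aligned:.3f}, mismatched pairs: {mismatched:.3f}")
```

The loss is lower when the two views of each image map to nearby latent vectors than when the pairs are mismatched, which is the behaviour the pipeline relies on before the projection head is discarded.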