Empowering Malware Detection Efficiency within Processing-in-Memory Architecture

Sreenitha Kasarapu, Sathwika Bavikadi, Sai Manoj Pudukotai Dinakarrao

TL;DR

This work tackles the resource-intensive problem of updating malware-detection models on embedded devices by leveraging a processing-in-memory (PIM) architecture with LUT-based cores and precision scaling. By processing data in-memory and quantizing inputs to lower bit-widths, the approach achieves high throughput and energy efficiency while maintaining competitive detection accuracy. The method maps CNN computations to LUT-core operations, enabling CNN acceleration directly inside DRAM clusters, and demonstrates strong performance gains over CPU, GPU, and prior LUT-PIM baselines. The results indicate that precision scaling combined with PIM can enable sustainable, real-time malware detection for edge and IoT environments, facilitating frequent model updates with reduced compute resources.

Abstract

The widespread integration of embedded systems across various industries has facilitated seamless connectivity among devices and bolstered computational capabilities. Despite their extensive applications, embedded systems encounter significant security threats, with one of the most critical vulnerabilities being malicious software, commonly known as malware. In recent times, malware detection techniques leveraging Machine Learning have gained popularity. Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) have proven particularly efficient in image processing tasks. However, one major drawback of neural network architectures is their substantial computational resource requirements. Continuous training of malware detection models with updated malware and benign samples demands immense computational resources, presenting a challenge for real-world applications. In response to these concerns, we propose a Processing-in-Memory (PIM)-based architecture to mitigate memory access latency, thereby reducing the resources consumed during model updates. To further enhance throughput and minimize energy consumption, we incorporate precision scaling techniques tailored for CNN models. Our proposed PIM architecture exhibits a 1.09x higher throughput compared to existing Lookup Table (LUT)-based PIM architectures. Additionally, precision scaling combined with PIM enhances energy efficiency by 1.5x compared to full-precision operations, without sacrificing performance. This innovative approach offers a promising solution to the resource-intensive nature of malware detection model updates, paving the way for more efficient and sustainable cybersecurity practices.
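The two ideas named in the abstract, precision scaling and LUT-based computation, can be illustrated with a small sketch. This is not the authors' implementation: the symmetric uniform quantizer, the dictionary-backed product table, and the function names `quantize`, `build_mul_lut`, `lut_dot`, and `quantized_dot` are all assumptions made for illustration. The sketch only shows why lowering the bit-width shrinks the lookup table and how a dot product (the core operation of a convolution) can be served from precomputed products instead of a multiplier.

```python
# Illustrative sketch only; the quantization scheme and LUT layout are
# assumptions, not the design described in the paper.

def quantize(values, bits):
    """Symmetric uniform quantization of floats to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1                # e.g. 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax                    # real value represented by one code step
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def build_mul_lut(bits):
    """Precompute every product of two signed codes at the given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    codes = range(-qmax, qmax + 1)
    return {(a, b): a * b for a in codes for b in codes}

def lut_dot(x_codes, w_codes, lut):
    """Dot product using table lookups in place of multiplications."""
    return sum(lut[(a, b)] for a, b in zip(x_codes, w_codes))

def quantized_dot(x, w, bits):
    """Quantize both operands, accumulate via the LUT, then rescale to floats."""
    x_codes, x_scale = quantize(x, bits)
    w_codes, w_scale = quantize(w, bits)
    lut = build_mul_lut(bits)
    return lut_dot(x_codes, w_codes, lut) * x_scale * w_scale

if __name__ == "__main__":
    x = [0.5, -1.0, 0.25, 0.75]               # hypothetical activations
    w = [1.0, 0.5, -0.5, 0.25]                # hypothetical weights
    exact = sum(a * b for a, b in zip(x, w))
    for bits in (8, 4):
        approx = quantized_dot(x, w, bits)
        print(bits, len(build_mul_lut(bits)), approx, abs(approx - exact))
```

In this sketch the table holds 65,025 entries at 8 bits and 225 at 4 bits, which is the kind of storage reduction that makes lower precision attractive for lookup-based cores, at the cost of a larger approximation error.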

Paper Structure

This paper contains 21 sections, 5 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Hierarchical view of the architecture implementing malware detection on the processing-in-memory architecture
  • Figure 2: Performance Evaluation of AlexNet, ResNet18, ResNet34, ResNet50, VGG16 and MobileNetV2 on the PIM accelerator with precision scaling (a) 32-bit floating point, (b) 16-bit integer type, (c) 8-bit integer type and (d) 4-bit integer type
  • Figure 3: Comparison of energy efficiency (Frames/Joule) for AlexNet, ResNet18, ResNet34, ResNet50, VGG16, and MobileNetV2 on the PIM accelerator
  • Figure 4: Comparison of Throughput (Frames/second) for AlexNet, ResNet18, ResNet34, ResNet50, VGG16, and MobileNetV2 on the PIM accelerator
  • Figure 5: Comparative performance analysis of PIM with respect to state-of-the-art hardware accelerator architectures in terms of throughput (Frames/second) and power consumption (Watt)