File Fragment Classification using Light-Weight Convolutional Neural Networks

Mustafa Ghaleb; Kunwar Saaim; Muhamad Felemban; Saleh Al-Saleh; Ahmad Al-Mulhem

TL;DR

This paper addresses the challenge of identifying file fragment types in digital forensics without relying on metadata. It introduces three light-weight CNN architectures based on depthwise separable convolutions (DSC, DSC-SE, M-DSC) that drastically reduce parameters while maintaining competitive accuracy for file fragment classification. Evaluated on the FFT-75 dataset, the models achieve about 79% accuracy with roughly 100K parameters and around 164 MFLOPs, outperforming FiFTy in inference speed while remaining competitive in accuracy. The results demonstrate the practicality of fast, resource-efficient on-device classification for large-scale forensic analysis, with potential for neural architecture search and distillation as future enhancements.

Abstract

In digital forensics, file fragment classification is an important step toward completing the file carving process. Several techniques exist to identify the type of a file fragment without relying on metadata, for example by using features such as headers/footers and N-grams. Recently, convolutional neural network (CNN) models have been used to build classifiers for this task. However, the number of parameters in CNNs tends to grow exponentially as the number of layers increases, which results in a dramatic increase in training and inference time. In this paper, we propose light-weight file fragment classification models based on depthwise separable CNNs. The evaluation results show that our proposed models provide faster inference with accuracy comparable to state-of-the-art CNN-based models. In particular, our models achieve an accuracy of 79% on the FFT-75 dataset with nearly 100K parameters and 164M FLOPs, which is 4x smaller and 6x faster than the state-of-the-art classifier in the literature.
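To illustrate why depthwise separable convolutions cut the parameter count, the following minimal sketch compares the weights of a standard convolution with those of its depthwise-plus-pointwise factorization. It assumes 1D convolutions over byte sequences and omits biases; the function names and layer sizes (`c_in`, `c_out`, `k`) are hypothetical and not taken from the paper's architectures.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Weights in a standard 1D convolution (bias omitted)."""
    return kernel_size * c_in * c_out


def depthwise_separable_conv1d_params(c_in, c_out, kernel_size):
    """Weights in a depthwise convolution followed by a pointwise one."""
    depthwise = kernel_size * c_in  # one k-wide filter per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 9

std = standard_conv1d_params(c_in, c_out, k)
dsc = depthwise_separable_conv1d_params(c_in, c_out, k)
reduction = std / dsc

print(f"standard: {std}, depthwise separable: {dsc}, reduction: {reduction:.2f}x")
```

For these sizes the factorized layer needs 8,768 weights instead of 73,728. In general the ratio of separable to standard weights is 1/c_out + 1/k, so the saving grows with both the kernel size and the number of output channels.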

Paper Structure (18 sections, 7 equations, 6 figures, 5 tables)

Introduction
Preliminaries
File carving
Convolution Neural Network
Depthwise Separable Convolution
Light-weight CNNs for File Fragment Classification
Depthwise Separable Convolution (DSC)
Depthwise Separable Convolution with Squeeze and Excitation (DSC-SE)
Modified Depthwise Separable Convolution (M-DSC)
Performance Evaluation
Dataset
Baseline Model
Experimental setup
Results
Discussion
...and 3 more sections

Figures (6)

Figure 1: Standard convolution in (a) is factorized into two layers: Depthwise Convolution in (b) and pointwise in (c).
Figure 2: Network Architectures. a) DSC, b) DSC-SE, and c) M-DSC
Figure 3: Inception Block
Figure 4: Modified Inception Block
Figure 5: Squeeze-and-Excitation Block
...and 1 more figure
