Audio-Based Classification of Insect Species Using Machine Learning Models: Cicada, Beetle, Termite, and Cricket

Manas V Shetty, Yoga Disha Sendhil Kumar

TL;DR

This work addresses audio-based classification of four insect species using MFCC features and traditional ML models. It investigates 40 MFCC-based features and multiple classifiers, including Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN), and explores the effects of feature selection and data augmentation. Key findings show near-perfect accuracy in initial evaluations and strong KNN performance across several conditions, while augmentation introduces class overlap that can reduce accuracy. The results demonstrate the potential for scalable acoustic monitoring in ecological and pest management applications, and the paper outlines limitations and future directions for more robust deployment.

Abstract

This project addresses the challenge of classifying four insect species (Cicada, Beetle, Termite, and Cricket) from sound recordings. Accurate species identification is crucial for ecological monitoring and pest management. We employ machine learning models such as XGBoost, Random Forest, and K-Nearest Neighbors (KNN) to analyze audio features, including Mel-Frequency Cepstral Coefficients (MFCCs). The potential novelty of this work lies in combining diverse audio features with machine learning models for insect classification, focusing on subtle acoustic variations between species that previous research has not fully leveraged. The dataset is compiled from various open sources, and we anticipate achieving high classification accuracy, contributing to improved automated insect detection systems.
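
To make the feature-vector-plus-classifier idea concrete, the sketch below runs a K-Nearest Neighbors majority vote over 40-dimensional vectors, one per recording. It is an illustration only, not the authors' pipeline: the vectors are synthetic Gaussian blobs standing in for MFCC features, and the names `make_synthetic_features` and `knn_predict`, the choice of k = 3, and the 20/10 split per class are all assumptions introduced here.

```python
import math
import random
from collections import Counter

SPECIES = ["Cicada", "Beetle", "Termite", "Cricket"]
N_FEATURES = 40  # the TL;DR mentions 40 MFCC-based features


def make_synthetic_features(species_index, rng):
    """Return a fake 40-dimensional feature vector for one recording.

    Assumption: each class is a Gaussian blob around its own centre.
    Real MFCC vectors would come from an audio feature extractor.
    """
    centre = 5.0 * species_index
    return [rng.gauss(centre, 1.0) for _ in range(N_FEATURES)]


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_x, train_y, test_x, test_y = [], [], [], []
for idx, name in enumerate(SPECIES):
    for i in range(30):
        vec = make_synthetic_features(idx, rng)
        # Hypothetical split: first 20 vectors per class train, last 10 test.
        if i < 20:
            train_x.append(vec)
            train_y.append(name)
        else:
            test_x.append(vec)
            test_y.append(name)

predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about performance on real insect recordings; it only shows the mechanics of the vote.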

Paper Structure

This paper contains 31 sections, 31 figures, 1 table.

Figures (31)

  • Figure 1: (a) Cricket, (b) Cicada, (c) Termite, and (d) Bark Beetle.
  • Figure 2: Segment Lengths by Class
  • Figure 3: Number of Initial Instances Per Class
  • Figure 4: First Few Rows of the MFCC DataFrames
  • Figure 5: Decision Tree Results
  • ...and 26 more figures