Audio-Based Classification of Insect Species Using Machine Learning Models: Cicada, Beetle, Termite, and Cricket
Manas V Shetty, Yoga Disha Sendhil Kumar
TL;DR
This work addresses audio-based classification of four insect species using MFCC features and traditional ML models. It investigates 40 MFCC-based features and multiple classifiers, including Decision Tree (DT), Random Forest (RF), and KNN, and explores the effects of feature selection and data augmentation. Key findings show near-perfect accuracy in initial evaluations and strong performance of KNN across several conditions, while augmentation introduces class overlap that can reduce accuracy. The results demonstrate the potential for scalable acoustic monitoring in ecological and pest management applications, and the work outlines limitations and future directions for more robust deployment.
Abstract
This project addresses the challenge of classifying four insect species (Cicada, Beetle, Termite, and Cricket) from sound recordings. Accurate species identification is crucial for ecological monitoring and pest management. We employ machine learning models such as XGBoost, Random Forest, and K-Nearest Neighbors (KNN) to analyze audio features, including Mel-Frequency Cepstral Coefficients (MFCC). The potential novelty of this work lies in combining diverse audio features with multiple machine learning models for insect classification, with a specific focus on capturing subtle acoustic variations between species that have not been fully leveraged in previous research. The dataset is compiled from various open sources, and we anticipate achieving high classification accuracy, contributing to improved automated insect detection systems.
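To make the classification step concrete, the following is a minimal sketch of K-Nearest Neighbors voting over fixed-length feature vectors. It is an illustration under stated assumptions, not the implementation used in this work: the 40-dimensional vectors are synthetic placeholders standing in for per-clip MFCC-based features, and the names `make_synthetic_clip` and `knn_predict`, the class centers, and the choice of k = 3 are all hypothetical.

```python
import math
import random
from collections import Counter

N_FEATURES = 40  # stands in for the 40 MFCC-based features per clip
SPECIES = ["cicada", "beetle", "termite", "cricket"]


def make_synthetic_clip(rng, class_index):
    """Return a placeholder 40-dim feature vector (NOT real MFCC data)."""
    center = 5.0 * class_index  # hypothetical, well-separated class centers
    return [rng.gauss(center, 1.0) for _ in range(N_FEATURES)]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Build a small synthetic training set: 20 placeholder clips per species.
rng = random.Random(0)
train_X, train_y = [], []
for idx, name in enumerate(SPECIES):
    for _ in range(20):
        train_X.append(make_synthetic_clip(rng, idx))
        train_y.append(name)

# Classify one fresh placeholder clip per species.
test_X = [make_synthetic_clip(rng, idx) for idx in range(len(SPECIES))]
predictions = [knn_predict(train_X, train_y, q) for q in test_X]
```

Because the placeholder clusters are deliberately far apart, this toy setup says nothing about real-world accuracy. In an actual pipeline, the synthetic vectors would be replaced by features extracted from the recordings, and overlapping classes would make the choice of k and of distance metric matter far more.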
