AI-Powered Early Detection of Critical Diseases using Image Processing and Audio Analysis

Manisha More, Kavya Bhand, Kaustubh Mukdam, Kavya Sharma, Manas Kawtikwar, Hridayansh Kaware, Prajwal Kavhar

TL;DR

An AI-driven multimodal framework is proposed for early detection of skin cancer, vascular clots, and cardiopulmonary abnormalities from image, thermal, and audio data. It deploys three specialized pipelines: MobileNetV2-based skin lesion classification on ISIC 2019, SVM-based thermal clot detection, and MFCC+Random Forest analysis of heart and lung sounds. The system reaches accuracies of 86.4% to 89.3% and AUC near 0.89–0.92 across tasks, and runs in under 2 seconds on low-cost hardware, supporting real-time edge deployment. The work demonstrates the feasibility of integrated, lightweight pre-diagnostic tools for improving access in resource-limited settings.

Abstract

Early diagnosis of critical diseases can significantly improve patient survival and reduce treatment costs. However, existing diagnostic techniques are often costly, invasive, and inaccessible in low-resource regions. This paper presents a multimodal artificial intelligence (AI) diagnostic framework integrating image analysis, thermal imaging, and audio signal processing for early detection of three major health conditions: skin cancer, vascular blood clots, and cardiopulmonary abnormalities. A fine-tuned MobileNetV2 convolutional neural network was trained on the ISIC 2019 dataset for skin lesion classification, achieving 89.3% accuracy, 91.6% sensitivity, and 88.2% specificity. A support vector machine (SVM) with handcrafted features was employed for thermal clot detection, achieving 86.4% accuracy (AUC = 0.89) on synthetic and clinical data. For cardiopulmonary analysis, lung and heart sound datasets from PhysioNet and Pascal were processed using Mel-Frequency Cepstral Coefficients (MFCC) and classified via Random Forest, reaching 87.2% accuracy and 85.7% sensitivity. Comparative evaluation against state-of-the-art models demonstrates that the proposed system achieves competitive results while remaining lightweight and deployable on low-cost devices. The framework provides a promising step toward scalable, real-time, and accessible AI-based pre-diagnostic healthcare solutions.
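The accuracy, sensitivity, and specificity figures quoted above are all derived from a confusion matrix. The following is a minimal sketch of that arithmetic, assuming a binary positive/negative framing (which the abstract does not spell out). The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate: TP / (TP + FN)
    specificity: float  # true negative rate: TN / (TN + FP)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # diseased case correctly flagged
            else:
                fn += 1  # diseased case missed
        else:
            if pred == positive:
                fp += 1  # healthy case wrongly flagged
            else:
                tn += 1  # healthy case correctly cleared

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    return BinaryMetrics(
        accuracy=(tp + tn) / len(y_true),
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
    )


# Toy labels invented for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, sensitivity is usually the figure to watch, because a missed positive case (a false negative) tends to cost more than a false alarm. Reporting it next to specificity shows the trade-off that accuracy alone hides.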

Paper Structure

This paper contains 22 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Flowchart of the model
  • Figure 2: Skin Cancer Detection Module
  • Figure 3: Blood Clot Detection Module