CoVid-19 Detection leveraging Vision Transformers and Explainable AI
Pangoth Santhosh Kumar, Kundrapu Supriya, Mallikharjuna Rao K, Taraka Satya Krishna Teja Malisetti
TL;DR
This work tackles the problem of early, accurate detection of lung diseases, including COVID-19, from chest X-ray and CT images. It proposes an end-to-end pipeline that combines CLAHE preprocessing, Ben Graham-based color normalization, and data augmentation with a Compact Convolution Transformer (CCT) to classify radiographs. The approach demonstrates strong performance on the COVID-19 Radiography Database, achieving up to 97% training and 94.6% validation accuracy, and employs Grad-CAM to provide explainable, location-specific visualizations of model decisions. The combination of ROI-focused preprocessing, transformer-based classification, and explainability has practical implications for rapid, interpretable lung-disease diagnostics in varied clinical settings.
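As a rough illustration of the explainability step, the sketch below shows how a Grad-CAM heatmap is typically formed from one layer's feature maps and the gradients of the target class score. It is not the authors' implementation: the function name `grad_cam_map`, the `(K, H, W)` array layout, and the toy values are assumptions made for this example, and which layer of the CCT the maps would come from (for instance, the convolutional tokenizer) is left open here.

```python
import numpy as np


def grad_cam_map(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine feature maps into a Grad-CAM heatmap.

    activations: (K, H, W) feature maps from a chosen layer.
    gradients:   (K, H, W) gradients of the target class score
                 with respect to those feature maps.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    if activations.ndim != 3 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (K, H, W)")
    # Channel weights: average each channel's gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Normalise for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example with made-up numbers (hypothetical, not data from the paper).
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 2.0  # channel 0 responds at the top-left position
acts[1, 1, 1] = 4.0  # channel 1 responds at the bottom-right position
grads = np.stack([np.full((2, 2), 1.0), np.full((2, 2), -1.0)])
heatmap = grad_cam_map(acts, grads)
```

In this toy case the channel with a negative average gradient is suppressed by the ReLU, so only the top-left position stays highlighted. In practice the heatmap would be upsampled to the radiograph's resolution and overlaid on the image.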
Abstract
Lung disease is a common health problem in many parts of the world. It poses a significant risk to people's health and quality of life across the globe, since it is responsible for five of the top thirty leading causes of death; COVID-19, pneumonia, and tuberculosis are among them. Diagnosing lung diseases in their early stages is critical: the earlier a condition is diagnosed, the better the patient's chances of making a full recovery and surviving in the long term. Several models based on machine learning and image processing have been developed for this purpose. Deep learning algorithms show significant promise for the autonomous, rapid, and accurate identification of lung diseases from medical imaging. Several deep learning strategies, including convolutional neural networks (CNN), vanilla neural networks, visual geometry group based networks (VGG), and capsule networks, are used to predict lung disease. The standard CNN performs poorly on rotated, tilted, or otherwise abnormally oriented images. In this study, we therefore propose an end-to-end framework for the diagnosis of lung disorders based on a vision transformer. The framework includes data augmentation, training of the proposed models, and evaluation of the models. To detect lung conditions such as pneumonia, COVID-19, lung opacity, and others, a specialised Compact Convolution Transformers (CCT) model has been tested and evaluated on datasets such as the COVID-19 Radiography Database. The model achieved better accuracy in both training and validation on the COVID-19 Radiography Database.
