Real-Time Aerial Fire Detection on Resource-Constrained Devices Using Knowledge Distillation
Sabina Jangirova, Branislava Jankovic, Waseem Ullah, Latif U. Khan, Mohsen Guizani
TL;DR
This work addresses real-time aerial fire detection on resource-constrained devices by distilling knowledge from a large transformer teacher (ViT/32) into a compact MobileViT-S student. The proposed KD framework trains the student with the loss $\mathcal{L} = (1-\alpha)\mathcal{L}_{CE}(y, y^s) + \alpha T^2 \mathcal{L}_{KLD}(s^t, s^s)$ to transfer global context, and Grad-CAM visualizations confirm that the student focuses on fire regions. Experiments on BowFire, ADSF, and DFAN show competitive accuracy and the highest FPS on edge hardware, enabling deployment on UAVs and IoT devices. The work advances practical, scalable fire monitoring, while noting remaining challenges with cloud and smoke disambiguation and suggesting temporal data and richer KD strategies as future work.
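To make the distillation objective concrete, the following is a minimal single-sample sketch of the loss in plain Python. It is not the authors' implementation. It assumes that $s^t$ and $s^s$ are the teacher and student softmax outputs softened by the temperature $T$, that the KL divergence is taken from teacher to student (the common convention in knowledge distillation), and that $y^s$ is the student's unsoftened prediction. The function names and the default values of `alpha` and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(label, student_logits):
    """Cross-entropy between a hard label (class index) and the student prediction."""
    probs = softmax(student_logits)
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(label, teacher_logits, student_logits,
                      alpha=0.5, temperature=4.0):
    """(1 - alpha) * CE(y, y^s) + alpha * T^2 * KLD(s^t, s^s).

    alpha and temperature defaults are illustrative assumptions,
    not values reported in the paper.
    """
    ce = cross_entropy(label, student_logits)
    soft_teacher = softmax(teacher_logits, temperature)  # s^t (assumed softened)
    soft_student = softmax(student_logits, temperature)  # s^s (assumed softened)
    kld = kl_divergence(soft_teacher, soft_student)
    return (1 - alpha) * ce + alpha * temperature ** 2 * kld


if __name__ == "__main__":
    # Two-class example (fire vs. non-fire) with made-up logits.
    loss = distillation_loss(label=0,
                             teacher_logits=[3.0, -1.0],
                             student_logits=[1.0, 0.5])
    print(f"distillation loss: {loss:.4f}")
```

Setting `alpha=0` reduces the objective to ordinary cross-entropy training, while `alpha=1` trains the student purely to match the teacher's softened distribution. The $T^2$ factor is commonly included so that the gradient magnitude of the soft-target term stays comparable as the temperature changes.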
Abstract
Wildfire catastrophes cause significant environmental degradation, human losses, and financial damage. To mitigate these severe impacts, early fire detection and warning systems are crucial. Current systems rely primarily on fixed CCTV cameras with a limited field of view, restricting their effectiveness in large outdoor environments. The fusion of intelligent fire detection with remote sensing improves coverage and mobility, enabling monitoring in remote and challenging areas. Existing approaches predominantly utilize convolutional neural networks and vision transformer models. While these architectures provide high accuracy in fire detection, their computational complexity limits real-time performance on edge devices such as UAVs. In this work, we present a lightweight fire detection model based on MobileViT-S, trained by distilling knowledge from a stronger teacher model. The ablation study highlights how the teacher model and the chosen distillation technique contribute to the model's performance improvement. We generate activation map visualizations using Grad-CAM to confirm the model's ability to focus on relevant fire regions. The high accuracy and efficiency of the proposed model make it well-suited for deployment on satellites, UAVs, and IoT devices for effective fire detection. Experiments on common fire benchmarks demonstrate that our model surpasses the state-of-the-art model by 0.44% and 2.00% while maintaining a compact model size. Our model delivers the highest processing speed among existing works, achieving real-time performance on resource-constrained devices.
