FlameFinder: Illuminating Obscured Fire through Smoke with Attentive Deep Metric Learning
Hossein Rajoli, Sahand Khoshdel, Fatemeh Afghah, Xiaolong Ma
TL;DR
FlameFinder tackles smoke-obscured flame detection using UAV-mounted thermal imagery by learning a discriminative latent space through deep metric learning (DML) on paired RGB-thermal data. It optimizes a multi-task loss that combines a binary cross-entropy term $L_{BCE}$, a reconstruction term $L_{rec}$, and a metric-learning term $L_{DML}$ (comprising the triplet, cosine, and center losses $L_{TL}$, $L_{cosL}$, and $L_{CL}$), and it employs an attention mechanism to balance the loss contributions, improving feature discrimination. An annotation strategy based on unobscured RGB frames provides indirect supervision for the thermal domain, while the Local Features Extraction (LFE) module and DML-guided inference enable robust detection in smoky patches. On FLAME2 and FLAME3, FlameFinder improves unobscured flame detection accuracy over the baseline by 4.4% and 7%, respectively, and separates classes better in obscured scenarios than baselines such as VGG19 and ResNet18. This work advances real-time wildfire monitoring by enabling reliable flame localization from smoke-affected thermal imagery.
Abstract
FlameFinder is a deep metric learning (DML) framework designed to accurately detect flames, even when they are obscured by smoke, using thermal images from firefighter drones during wildfire monitoring. Traditional RGB cameras struggle in such conditions, whereas thermal cameras can capture smoke-obscured flame features. Thermal cameras, however, lack absolute thermal reference points, which leads to false positives. To address this issue, FlameFinder uses paired thermal-RGB images for training. By learning latent flame features from smoke-free samples, the model becomes less biased towards relative thermal gradients. In testing, it identifies flames in smoky patches by analyzing their equivalent thermal-domain distribution. This method improves performance as measured by both supervised and distance-based clustering metrics. The framework combines a flame segmentation method with DML-aided detection. The detection stage uses center loss (CL), triplet center loss (TCL), and triplet cosine center loss (TCCL) to identify optimal cluster representatives for classification. However, center loss dominates the other losses, so the model misses features to which those losses are sensitive. To address this limitation, an attention mechanism is proposed. This mechanism allows for non-uniform feature contribution, amplifying the critical role of cosine and triplet loss in the DML framework. Additionally, it improves interpretability and class discrimination and decreases intra-class variance. As a result, the proposed model surpasses the baseline by 4.4% on the FLAME2 dataset and 7% on the FLAME3 dataset in unobscured flame detection accuracy. Moreover, it demonstrates enhanced class separation in obscured scenarios compared to VGG19, ResNet18, and three backbone models tailored for flame detection.
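To make the center-based losses named above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It uses common textbook forms of center loss, triplet center loss, and a cosine-distance variant, and it stands in a simple softmax over three scalar scores for the attention mechanism. The margins, the two-dimensional embeddings, the class names, and every function name are assumptions chosen for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_dist(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def center_loss(z, label, centers):
    """Pull an embedding towards the center of its own class."""
    return 0.5 * sq_dist(z, centers[label])

def triplet_center_loss(z, label, centers, margin=1.0):
    """Hinge loss: own center must be closer than the nearest other center by `margin`."""
    pos = sq_dist(z, centers[label])
    neg = min(sq_dist(z, c) for k, c in centers.items() if k != label)
    return max(0.0, pos + margin - neg)

def triplet_cosine_center_loss(z, label, centers, margin=0.5):
    """Same hinge structure as above, measured with cosine distance."""
    pos = cosine_dist(z, centers[label])
    neg = min(cosine_dist(z, c) for k, c in centers.items() if k != label)
    return max(0.0, pos + margin - neg)

def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_dml_loss(z, label, centers, scores):
    """Combine the three losses with softmax weights (a stand-in for attention)."""
    losses = [
        center_loss(z, label, centers),
        triplet_center_loss(z, label, centers),
        triplet_cosine_center_loss(z, label, centers),
    ]
    weights = softmax(scores)
    return sum(w * l for w, l in zip(weights, losses)), weights

def nearest_center(z, centers):
    """Distance-based inference: assign the class whose center is closest."""
    return min(centers, key=lambda k: sq_dist(z, centers[k]))

# Hypothetical 2-D class centers and embeddings, for illustration only.
centers = {"fire": [1.0, 0.0], "no_fire": [0.0, 1.0]}
easy = [0.9, 0.1]   # lies close to the "fire" center
hard = [0.5, 0.5]   # equidistant from both centers

total, weights = weighted_dml_loss(easy, "fire", centers, scores=[0.0, 0.0, 0.0])
print(nearest_center(easy, centers), total, weights)
print(triplet_center_loss(hard, "fire", centers))
```

With equal scores the three losses contribute equally. Raising the second or third score shifts weight towards the triplet or cosine term, which is the kind of rebalancing the abstract describes for countering the dominance of center loss.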
