From Occurrence to Consequence: A Comprehensive Data-driven Analysis of Building Fire Risk

Chenzhi Ma, Hongru Du, Shengzhi Luan, Ensheng Dong, Lauren M. Gardner, Thomas Gernay

TL;DR

This study tackles the persistent risk of building fires in the United States by integrating National Fire Incident Reporting System (NFIRS) incident reports with social determinants, building inventories, and weather data to examine both fire occurrence and fire consequences. It uses generalized additive models (GAMs) to map county-level occurrence and a CatBoost-based model, FireCat, to predict three consequence outcomes (fire spread, injuries, and economic loss) at fine spatial scales, with SHAP used for explanation. Key findings show that local vulnerabilities and incident-specific factors jointly shape risk: vulnerable communities experience more frequent fires and more severe outcomes, while incident attributes (origin, ignition, heat source) predominantly drive consequences. Fire detectors and automatic extinguishing systems (AES) substantially reduce spread and injuries, which supports targeted safety mandates and subsidies. The framework emphasizes local-context risk assessment for equitable fire prevention, identifies data-standardization needs, and points to future work in causal inference and natural language processing to improve predictive power and policy relevance.

Abstract

Building fires pose a persistent threat to life, property, and infrastructure, emphasizing the need for advanced risk mitigation strategies. This study presents a data-driven framework analyzing U.S. fire risks by integrating over one million fire incident reports with diverse fire-relevant datasets, including social determinants, building inventories, weather conditions, and incident-specific factors. By adapting machine learning models, we identify key risk factors influencing fire occurrence and consequences. Our findings show that vulnerable communities, characterized by socioeconomic disparities or the prevalence of outdated or vacant buildings, face higher fire risks. Incident-specific factors, such as fire origins and safety features, strongly influence fire consequences. Buildings equipped with fire detectors and automatic extinguishing systems experience significantly lower fire spread and injury risks. By pinpointing high-risk areas and populations, this research supports targeted interventions, including mandating fire safety systems and providing subsidies for disadvantaged communities. These measures can enhance fire prevention, protect vulnerable groups, and promote safer, more equitable communities.

Paper Structure

This paper contains 29 sections, 5 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Data foundation for nationwide building fire risk analysis. (a) The building fire-related dataset combines over one million NFIRS fire incident reports (2012–2022) with detailed incident-specific information on building characteristics, incident timing and location, firefighting efforts, and local information on demographics, social determinants, building inventories, business proportions, and weather conditions at monthly and hourly resolution. (b) Spatial distribution of building fire incident rates, defined as the number of fires spreading beyond the item of origin per 100,000 building units annually. The spiral chart illustrates the monthly distribution of building fire events across the U.S., with data normalized relative to the peak fire month of each year. (c) Classification of fire spread from the room of origin: confined to the room, floor, building, or beyond the building of origin. (d) Distribution of fire-induced injuries for incidents with reported injuries, including number of injuries and severity level. (e) Distribution of fire-induced economic loss, with classification into low, moderate, and high based on quantile thresholds at 40%, 75%, and 100% of the historical loss data.
  • Figure 2: Fire incident rates in the U.S. and corresponding effects of local factors. (a) Fixed effects of states on fire occurrence in the GAM, where a darker shade of red signifies a higher fire incident rate and a darker shade of blue denotes a lower rate. (b) Partial dependence plots of the local (socioeconomic, building inventory, and business) and climate factors influencing building fire incident rates (n = 38804, excluding counties with fewer than 3 events in a month). ‘***’: variable significant at p < 0.001. ‘**’: variable significant at p < 0.01. ‘*’: variable significant at p < 0.05. The number on the right indicates the rank of the F-value. Miami-Dade County is highlighted on the plot with a blue dot. (c, d) Results from seasonal (c) and regional (d) GAMs incorporating the most significant variables identified within each category: social determinant, business proportion, building inventory, and weather conditions. Details of the GAMs are provided in the Methods section, with the complete results included in Supplementary Section 3.
  • Figure 3: Performance evaluation and prediction comparison of FireCat across fire spread, human injury, and economic loss. (a–f) Confusion matrices and prediction accuracy at various levels of prediction confidence for the fire consequence models of fire spread (a, d), human injury (b, e), and economic loss (c, f). (g, h, i) Comparison between the predicted and true distributions of fire consequences in Miami-Dade County and Broward County for fire spread (g), human injury (h), and economic loss (i).
  • Figure 4: Factor importance analysis of FireCat for predicting fire consequences based on SHAP values. (a) Top eight most important incident-specific factors and local factors of FireCat. The length of the bars represents the importance of factors, while the color shading distinguishes the incident-specific (light) and local (dark) factors. (b, c, d) Sub-category of each incident-specific factor with the highest positive and negative influence on fire spread, human injury, and economic loss. Sub-categories with positive (negative) SHAP values tend to increase (decrease) the predicted fire consequence levels. The complete spectrum of SHAP values for each sub-category within the incident-specific factors is detailed in Supplementary Figs. S5–S7.
  • Figure 5: SHAP value for top eight local factors influencing consequence levels of (a) fire spread, (b) human injury, and (c) economic loss. The y-axis lists the names of the eight factors, ranked by their mean SHAP values (light blue bars, top x-axis), which represent each factor's overall importance in the model. The bottom x-axis displays SHAP value contributions, showing whether a factor increases (positive SHAP) or decreases (negative SHAP) the predicted fire consequence levels. A color gradient (blue to red) represents the factor's actual value: blue indicates low values, while red indicates high values, corresponding to the factor's data range. For example, in the context of fire spread, low relative humidity (blue) strongly increases the consequence level, while high humidity (red) reduces it.
  • ...and 1 more figure
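
Two definitions in the Figure 1 caption lend themselves to a small worked example: the incident rate (fires per 100,000 building units per year) and the low/moderate/high economic-loss classes cut at the 40% and 75% quantiles of historical losses. The following is a minimal Python sketch under stated assumptions, not the authors' code: the function names, the linear-interpolation quantile rule, the inclusive upper class boundaries, and the sample loss values are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def incident_rate(n_fires: int, building_units: int, years: float = 1.0) -> float:
    """Fires per 100,000 building units per year (definition from Fig. 1b)."""
    return n_fires * 100_000 / (building_units * years)


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Empirical quantile with linear interpolation (an assumed convention)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def loss_classes(losses: Sequence[float]) -> List[str]:
    """Label each loss as low / moderate / high using the 40% and 75% quantiles.

    Boundary handling (<=) is an assumption; the caption does not specify it.
    """
    ordered = sorted(losses)
    q40 = quantile(ordered, 0.40)
    q75 = quantile(ordered, 0.75)
    labels = []
    for loss in losses:
        if loss <= q40:
            labels.append("low")
        elif loss <= q75:
            labels.append("moderate")
        else:
            labels.append("high")
    return labels


# Hypothetical loss values in dollars, for illustration only.
sample_losses = [0, 100, 500, 1000, 2000, 5000, 10000, 20000,
                 50000, 100000, 250000]
sample_labels = loss_classes(sample_losses)

# Hypothetical county: 50 fires over 2 years among 200,000 building units.
sample_rate = incident_rate(50, 200_000, years=2)
```

Because the class boundaries come from the historical loss distribution itself, roughly 40% of incidents fall in the low class, 35% in the moderate class, and 25% in the high class by construction.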