San Francisco Building Fires: Sprinkler System Effectiveness Analysis
Executive Summary
This report presents a comprehensive analysis of fire sprinkler system effectiveness in building fires across San Francisco. Using data from the San Francisco Fire Department's incident reports, we analyzed patterns in sprinkler system performance and developed a machine learning model to predict sprinkler effectiveness based on various factors.
The analysis revealed that sprinkler systems are highly effective, with approximately 88.5% of activations successfully controlling fires. Wet-pipe sprinkler systems, which account for 96.9% of all systems in the dataset, demonstrated the highest effectiveness rates. The machine learning model achieved 93.2% accuracy in predicting sprinkler performance, though class imbalance remains a challenge due to the rarity of ineffective activations.
Our temporal analysis revealed distinct patterns in sprinkler activations, with peak occurrences during evening hours (5-8 PM), particularly on weekdays. These patterns likely correspond to cooking activities, which are a common cause of building fires requiring sprinkler activation.
Key recommendations include prioritizing wet-pipe sprinkler systems in new building installations, implementing targeted maintenance programs for buildings with less effective system types, and scheduling inspections during peak fire incident hours. These insights can help fire safety professionals, building owners, and policymakers improve building fire safety and reduce the impact of fire incidents in San Francisco.
1. Introduction
1.1 Background
Fire sprinkler systems are a critical component of building fire safety infrastructure, designed to automatically detect and control fires in their early stages. Understanding the effectiveness of these systems in real-world fire incidents is essential for improving fire safety regulations, building codes, and emergency response strategies.
San Francisco, with its dense urban environment and diverse building stock, provides an excellent case study for analyzing sprinkler system performance. The city's comprehensive fire incident reporting system captures detailed information about each fire event, including the presence, type, and performance of automatic extinguishing systems.
1.2 Project Objectives
This analysis aimed to:
Identify and analyze building fires with sprinkler system activations in San Francisco
Evaluate the effectiveness of different types of sprinkler systems
Identify factors that influence sprinkler system performance
Develop a classification model to predict sprinkler effectiveness
Analyze temporal patterns in sprinkler activations
Provide actionable recommendations for improving building fire safety
1.3 Dataset Overview
The analysis used the San Francisco Fire Incidents dataset from the SF Open Data Portal, which contains detailed records of fire department responses. The dataset includes information on incident type, location, time, building characteristics, fire protection systems, and outcomes.
From the original dataset of over 700,000 incidents, we filtered to focus specifically on:
Building fires (approximately 31,365 incidents)
Incidents where sprinkler systems were present (a subset of building fires)
Incidents where sprinkler systems activated (586 incidents after removing undetermined system types)
This focused dataset provided the foundation for our analysis of sprinkler system effectiveness.
2. Methodology
2.1 Data Acquisition and Filtering
The analysis began with acquiring the San Francisco Fire Incidents dataset from the SF Open Data Portal. The raw dataset contained over 700,000 records with 66 columns, covering all types of fire department responses from 2003 to the present.
To focus specifically on building fires with sprinkler activations, we applied the following filtering steps:
Filtered for building fire incidents using the 'Primary Situation' column
Further filtered to include only incidents where automatic extinguishing systems (AES) were present
Further filtered to include only incidents where the sprinkler system operated
Removed incidents with undetermined system types
This filtering process resulted in a dataset of 586 building fire incidents with sprinkler activations.
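The filtering steps above can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the column names follow the data dictionary in Appendix A, while the sample rows and the cell values ("Present", "Operated", "Undetermined") are hypothetical stand-ins for whatever coding the real dataset uses.

```python
import csv
import io

# Hypothetical sample rows. Column names follow Appendix A; the cell values
# are illustrative assumptions, not values copied from the real dataset.
SAMPLE_CSV = """Incident Number,Primary Situation,AES Present,AES Operation,AES Type
1,Building fire,Present,Operated,Wet-pipe sprinkler
2,Building fire,Present,Operated,Undetermined
3,Building fire,None present,,
4,Cooking fire,Present,Operated,Wet-pipe sprinkler
5,Building fire,Present,Did not operate,Dry-pipe sprinkler
"""


def filter_sprinkler_activations(rows):
    """Apply the four filtering steps in order and return the surviving rows."""
    kept = []
    for row in rows:
        if "building fire" not in row["Primary Situation"].lower():
            continue  # step 1: building fires only
        if row["AES Present"].strip().lower() != "present":
            continue  # step 2: an extinguishing system was present
        if row["AES Operation"].strip().lower() != "operated":
            continue  # step 3: the system operated
        if row["AES Type"].strip().lower() in ("", "undetermined"):
            continue  # step 4: drop undetermined system types
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
activations = filter_sprinkler_activations(rows)
print([r["Incident Number"] for r in activations])
```

Only incident 1 passes all four steps in this sample. Applying the steps in this order also makes it easy to log how many rows each step removes.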
2.2 Data Preprocessing
The preprocessing phase involved several steps to prepare the data for analysis and modeling:
Handling Missing Values: We identified and addressed missing values in key columns, dropping rows with missing values in critical fields like system type and performance.
Feature Engineering: We extracted temporal features from the incident date, including year, month, day, hour, and day of week, to analyze temporal patterns in sprinkler activations.
Target Variable Creation: We simplified the sprinkler performance categories into four main classes:
Effective: System operated and was effective in controlling the fire
Not Effective: System operated but was not effective in controlling the fire
Fire Too Small: System operated but the fire was too small to activate enough sprinklers
Did Not Operate: System should have operated but failed to do so
Feature Selection: We selected a combination of temporal features, building characteristics, and incident details as predictors for the classification model.
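The temporal feature engineering step can be sketched as below. The function name `temporal_features` and the ISO-style timestamp format are assumptions made for illustration; the real Incident Date column may need a different parser.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def temporal_features(incident_date: str) -> dict:
    """Split an ISO-style timestamp into year, month, day, hour, and day of week."""
    ts = datetime.fromisoformat(incident_date)
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        # weekday() returns 0 for Monday through 6 for Sunday
        "day_of_week": DAY_NAMES[ts.weekday()],
    }


# Hypothetical timestamp, used only to show the shape of the output
features = temporal_features("2019-06-13T18:05:00")
print(features)
```

Looking up the day name from a fixed list rather than `strftime("%A")` keeps the result independent of the system locale.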
2.3 Exploratory Data Analysis
The exploratory data analysis (EDA) phase focused on understanding:
The distribution of sprinkler system types
The performance of different system types
Temporal patterns in sprinkler incidents
Relationships between system characteristics and performance
We used various visualization techniques, including bar charts, pie charts, cross-tabulations, and heat maps to identify patterns and relationships in the data.
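A cross-tabulation of system type against performance, one of the techniques mentioned above, can be sketched with the standard library alone. The records below are hypothetical and exist only to show the mechanics; the helper name `effectiveness_rate` is likewise an assumption.

```python
from collections import Counter

# Hypothetical (system type, performance) pairs, not real incident data
records = [
    ("Wet-pipe", "Effective"),
    ("Wet-pipe", "Effective"),
    ("Wet-pipe", "Not Effective"),
    ("Dry chemical", "Effective"),
    ("Dry chemical", "Not Effective"),
]

# Cell counts of the cross-tabulation, keyed by (system type, performance)
crosstab = Counter(records)
# Row totals: number of incidents per system type
totals = Counter(system for system, _ in records)


def effectiveness_rate(system: str) -> float:
    """Share of a system type's activations recorded as 'Effective'."""
    return crosstab[(system, "Effective")] / totals[system]


for system in totals:
    print(system, round(effectiveness_rate(system), 3))
```

The same two counters are enough to feed a bar chart of effectiveness rate by system type.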
2.4 Temporal Pattern Analysis
To understand when sprinkler activations most frequently occur, we conducted a detailed temporal analysis:
Created a heat map showing activations by hour of day and day of week
Analyzed patterns in activations by hour of day
Examined variations in activations by day of week
Identified peak activation times and low-activity periods
This temporal analysis provided valuable insights into when sprinkler systems are most likely to be activated, which has important implications for inspection scheduling and resource allocation.
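The grid behind such a heat map is a 7 x 24 table of counts. A minimal sketch, assuming ISO-style timestamps and using hypothetical sample values, might look like this:

```python
from collections import Counter
from datetime import datetime

# Hypothetical activation timestamps, for illustration only
timestamps = [
    "2019-06-13T18:05:00",  # Thursday, 6 PM
    "2019-06-13T18:40:00",  # Thursday, 6 PM
    "2019-06-15T03:10:00",  # Saturday, 3 AM
    "2019-06-17T18:20:00",  # Monday, 6 PM
]


def build_heatmap(stamps):
    """Return a 7 x 24 grid of counts: rows are Monday..Sunday, columns are hours."""
    counts = Counter()
    for stamp in stamps:
        ts = datetime.fromisoformat(stamp)
        counts[(ts.weekday(), ts.hour)] += 1
    return [[counts[(day, hour)] for hour in range(24)] for day in range(7)]


grid = build_heatmap(timestamps)

# Locate the busiest (day, hour) cell; ties resolve to the earliest cell
cells = [(day, hour) for day in range(7) for hour in range(24)]
peak_day, peak_hour = max(cells, key=lambda c: grid[c[0]][c[1]])
print(peak_day, peak_hour, grid[peak_day][peak_hour])
```

Row sums of `grid` give the day-of-week totals and column sums give the hour-of-day totals, so the three views listed above come from one structure.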
2.5 Model Development
To predict sprinkler system performance, we developed a Random Forest classification model with the following approach:
Feature Preprocessing: We applied standard scaling to numeric features and one-hot encoding to categorical features.
Handling Class Imbalance: We calculated class weights to address the severe imbalance in the target variable, where "Effective" cases outnumbered "Did Not Operate" cases by a ratio of 274:1.
Model Training: We trained a Random Forest classifier with 100 trees, a maximum depth of 10, and the calculated class weights.
Model Evaluation: We evaluated the model using accuracy, precision, recall, F1-score, confusion matrices, and ROC curves.
Cross-Validation: We performed 5-fold cross-validation to assess the model's stability and generalizability.
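The report does not state how the class weights were calculated. One common choice, sketched here as an assumption, is the "balanced" heuristic (the same formula scikit-learn uses for `class_weight="balanced"`): each class receives n_samples / (n_classes * class_count). The counts below are the category totals reported in Section 3.2; weights fitted on a training split would differ somewhat.

```python
# Category counts as reported in Section 3.2 (full dataset, not a training split)
counts = {
    "Effective": 548,
    "Not Effective": 26,
    "Fire Too Small": 6,
    "Did Not Operate": 2,
}


def balanced_class_weights(class_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


weights = balanced_class_weights(counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Under this scheme the rarest class is weighted 274 times more heavily than the most common one, mirroring the 274:1 imbalance described above. With only two examples in that class, weighting alone cannot supply the variety a model needs to learn it.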
3. Results and Findings
3.1 Sprinkler System Types
The analysis of sprinkler system types revealed:
Wet-pipe sprinkler systems are by far the most common, accounting for 96.9% of all systems in the dataset (568 incidents)
Dry chemical systems account for 2.2% of systems (13 incidents)
Dry-pipe sprinkler systems account for 0.5% of systems (3 incidents)
Other types (foam, special hazard) account for 0.3% of systems (2 incidents)
This distribution reflects the widespread use of wet-pipe systems in buildings, which are generally more reliable and less expensive than other system types.
3.2 Sprinkler System Performance
The analysis of sprinkler system performance showed:
Overall effectiveness rate across all system types: 88.5%
Wet-pipe sprinkler systems have an effectiveness rate of 88.7%
Dry chemical systems have an effectiveness rate of 84.6%
The performance categories were distributed as follows:
Effective: 548 incidents (93.5%)
Not Effective: 26 incidents (4.4%)
Fire Too Small: 6 incidents (1.0%)
Did Not Operate: 2 incidents (0.3%)
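This distribution highlights the high overall effectiveness of sprinkler systems in controlling fires, but also reveals the severe class imbalance in the dataset. The four categories account for 582 of the 586 incidents. The 93.5% share of "Effective" outcomes is also higher than the 88.5% effectiveness rate reported above, which suggests the two figures rest on different definitions or denominators.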
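The percentage shares in Sections 3.1 and 3.2 can be recomputed from the reported counts. The sketch below uses only figures stated in this report; the helper name `percent_shares` is an assumption.

```python
TOTAL_INCIDENTS = 586

# Counts as reported in Section 3.1
system_counts = {"Wet-pipe": 568, "Dry chemical": 13, "Dry-pipe": 3, "Other": 2}
# Counts as reported in Section 3.2
performance_counts = {
    "Effective": 548,
    "Not Effective": 26,
    "Fire Too Small": 6,
    "Did Not Operate": 2,
}


def percent_shares(counts, total):
    """Express each count as a percentage of the total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


system_shares = percent_shares(system_counts, TOTAL_INCIDENTS)
performance_shares = percent_shares(performance_counts, TOTAL_INCIDENTS)

print(system_shares, sum(system_counts.values()))
print(performance_shares, sum(performance_counts.values()))
```

The system-type counts sum to all 586 incidents, while the performance counts sum to 582.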
3.3 Temporal Patterns
The temporal analysis revealed several important patterns in sprinkler activations:
3.3.1 Hour of Day Patterns
Sprinkler activations show a clear pattern throughout the day, with the highest frequency occurring during evening hours (5-8 PM)
Peak activation hour: 6 PM
Low activation periods: Early morning hours (2-5 AM) consistently show the fewest activations
The evening peak likely corresponds to cooking activities, which is consistent with cooking fires being the most common type of building fire
3.3.2 Day of Week Patterns
Weekdays (Monday-Friday) account for approximately 70% of all sprinkler activations
Weekends (Saturday-Sunday) account for approximately 30% of activations
The distribution across weekdays is relatively even, with slight variations
The temporal heat map reveals that the patterns of activations by hour vary between weekdays and weekends
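Because there are five weekdays and only two weekend days, the raw shares are easier to compare after dividing by the number of days in each group. The sketch below uses the approximate 70% and 30% figures above, so the result is indicative only.

```python
# Approximate shares as reported above
weekday_share = 0.70
weekend_share = 0.30

# Average share of activations per individual day in each group
per_weekday = weekday_share / 5
per_weekend_day = weekend_share / 2

# Share weekdays would hold if activations were spread evenly over all 7 days
uniform_weekday_share = 5 / 7

print(f"per weekday:     {per_weekday:.1%}")
print(f"per weekend day: {per_weekend_day:.1%}")
print(f"uniform weekday share: {uniform_weekday_share:.1%}")
```

On these rounded figures, an average weekday carries about 14% of activations and an average weekend day about 15%, and the 70% weekday share sits close to the 71.4% expected under an even spread. Any weekday versus weekend difference is therefore more likely to show up in the hourly pattern than in the daily totals.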
3.3.3 Combined Temporal Patterns
The highest concentration of sprinkler activations occurs on Thursdays at 6 PM
Weekday evenings consistently show the highest activation rates
Weekend activation patterns differ from weekdays, with more distributed activation times
These patterns provide valuable insights for scheduling inspections and maintenance activities
3.4 Classification Model Performance
The Random Forest classification model achieved:
Overall accuracy: 93.2% on the test set
Cross-validation accuracy: 93.9% ± 1.1% across 5 folds
Performance varied significantly by class:
"Effective" class: 93.8% precision, 99.3% recall, 96.5% F1-score
"Did Not Operate" class: 0.0% precision, 0.0% recall, 0.0% F1-score
"Fire Too Small" class: 50.0% precision, 14.3% recall, 22.2% F1-score
"Not Effective" class: 0.0% precision, 0.0% recall, 0.0% F1-score
The model performed well at identifying "Effective" cases, the majority class, but struggled with the minority classes despite using class weights. Because "Effective" outcomes make up 93.5% of incidents, the 93.2% overall accuracy is roughly what always predicting the majority class would achieve, so the per-class metrics are the more informative measure here. This performance pattern reflects the challenges of working with severely imbalanced datasets in fire incident classification.
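The reported F1-scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below also computes a majority-class baseline from the counts in Section 3.2, under the assumption that the test set has roughly the same class mix as the full dataset.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# The two non-zero rows reported above
f1_high_row = f1_score(0.938, 0.993)
f1_fire_too_small = f1_score(0.500, 0.143)
f1_zero_row = f1_score(0.0, 0.0)

# Accuracy of always predicting the most frequent class, using the
# Section 3.2 counts (assumes the test set mirrors the full dataset)
majority_baseline = 548 / 586

print(round(100 * f1_high_row, 1))
print(round(100 * f1_fire_too_small, 1))
print(round(100 * majority_baseline, 1))
```

Both recomputed F1 values match the reported 96.5% and 22.2%, and the baseline of about 93.5% shows why accuracy alone says little about performance on the rare classes.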
3.5 Feature Importance
The analysis of feature importance revealed that the most important predictors of sprinkler performance were:
Temporal features, particularly hour of day and day of week
Building characteristics, including property use and area of fire origin
Fire characteristics, including heat source and primary situation
These findings align with domain knowledge about factors that influence sprinkler effectiveness, such as the timing of fire incidents and the characteristics of the building and fire.
4. Discussion
4.1 Effectiveness of Sprinkler Systems
The high overall effectiveness rate (88.5%) confirms the value of sprinkler systems in building fire safety. When sprinklers activate, they are highly likely to control the fire effectively, potentially saving lives and reducing property damage.
The analysis also revealed that wet-pipe sprinkler systems, which are the most common type, demonstrate high effectiveness rates. This finding supports current building code requirements and industry practices that favor wet-pipe systems for most applications.
4.2 Temporal Insights and Implications
The temporal analysis provides valuable insights into when sprinkler systems are most likely to be activated:
Evening Peak: The concentration of activations during evening hours (5-8 PM) suggests that cooking activities may be a significant trigger for sprinkler activations. This is consistent with cooking being a common cause of building fires, although a breakdown of activations by cause is not reported here.
Weekday vs. Weekend Patterns: The different patterns observed on weekdays versus weekends reflect variations in building occupancy and usage patterns. Residential buildings may have different peak times compared to commercial or office buildings.
Resource Allocation: These temporal patterns can inform resource allocation for fire departments, ensuring adequate staffing during high-risk periods.
Inspection Scheduling: Building inspections could be strategically scheduled before peak activation hours to ensure systems are properly maintained when they're most likely to be needed.
4.3 Challenges in Predicting Rare Failures
The severe class imbalance in the dataset presented significant challenges for the classification model. Despite using class weights to address this imbalance, the model struggled to accurately predict minority classes like "Not Effective" and "Fire Too Small."
This challenge reflects the reality of sprinkler system performance: failures are rare, making it difficult to gather sufficient data to model these events accurately. Future work could explore more advanced techniques for imbalanced classification or focus on specific subsets of incidents where failures are more common.
4.4 Limitations of the Analysis
Several limitations should be considered when interpreting the results:
Missing Coordinate Data: The dataset lacked latitude and longitude coordinates, preventing spatial analysis of sprinkler incidents across San Francisco.
Limited Sample Size: With only 586 incidents after filtering, the dataset provided limited examples of rare events like sprinkler failures.
Potential Reporting Bias: The dataset only includes incidents reported to the fire department, potentially missing minor incidents where sprinklers activated but the fire department was not called.
Limited Building Information: The dataset contained limited information about building characteristics, which could influence sprinkler performance.
Despite these limitations, the analysis provides valuable insights into sprinkler system effectiveness and factors that influence performance.
5. Recommendations
Based on the findings of this analysis, we recommend the following actions to improve building fire safety:
5.1 System Selection and Installation
Prioritize Wet-Pipe Systems: Continue to prioritize wet-pipe sprinkler systems in new building installations due to their demonstrated effectiveness.
Consider Building Use: Select appropriate system types based on building use, with special attention to buildings with high cooking activity.
Ensure Proper Installation: Follow industry best practices for system installation to maximize effectiveness.
5.2 Maintenance and Inspection
Targeted Maintenance Programs: Implement targeted maintenance programs for buildings with system types showing lower effectiveness rates.
Strategic Inspection Timing: Schedule inspections during or before peak fire incident hours (5-8 PM) to maximize system readiness when most needed.
Regular Testing: Conduct regular testing of all sprinkler systems to identify and address potential issues before fires occur.
5.3 Education and Training
Occupant Education: Develop educational programs for building occupants on proper use of cooking equipment, as cooking fires represent a significant portion of sprinkler activations.
Maintenance Staff Training: Ensure maintenance staff are properly trained in sprinkler system maintenance and testing procedures.
Emergency Response Training: Train building occupants on appropriate actions when sprinklers activate.
5.4 Resource Allocation
Staffing Based on Temporal Patterns: Allocate fire department resources based on the identified temporal patterns, ensuring adequate staffing during peak hours (5-8 PM).
Focus on High-Risk Times: Implement targeted fire prevention campaigns during high-risk periods.
Weekend vs. Weekday Planning: Adjust resource allocation strategies to account for different activation patterns on weekends versus weekdays.
5.5 Data Collection and Analysis
Improved Coordinate Data: Enhance data collection practices to include geographic coordinates, which would enable spatial analysis of sprinkler incidents.
More Detailed Building Information: Collect more detailed information about building characteristics to better understand their influence on sprinkler performance.
Continued Analysis: Conduct ongoing analysis of sprinkler performance data to identify trends and emerging issues.
6. Conclusion
This analysis of sprinkler system effectiveness in San Francisco building fires provides valuable insights for fire safety professionals, building owners, and policymakers. The high overall effectiveness rate of sprinkler systems confirms their importance in fire safety infrastructure.
The temporal analysis revealed clear patterns in when sprinkler activations occur, with peak times during evening hours (5-8 PM) and variations between weekdays and weekends. These patterns have important implications for inspection scheduling, resource allocation, and targeted education efforts.
The machine learning model successfully predicts sprinkler performance with high accuracy for the majority class, though class imbalance remains a challenge for predicting rare failures. The model identifies key factors influencing sprinkler effectiveness, which can inform targeted interventions and system improvements.
The findings highlight the critical role of proper system selection, installation, and maintenance in ensuring optimal sprinkler performance during fire incidents. By implementing the recommendations outlined in this report, stakeholders can work toward improving building fire safety and reducing the impact of fire incidents in San Francisco.
7. References
San Francisco Fire Department. (2023). Fire Incidents Dataset. Retrieved from SF Open Data Portal: https://data.sfgov.org/Public-Safety/Fire-Incidents/wr8u-xric
National Fire Protection Association. (2022). NFPA 13: Standard for the Installation of Sprinkler Systems.
Hall, J. R. (2013). U.S. Experience with Sprinklers. National Fire Protection Association.
Ahrens, M. (2021). U.S. Experience with Sprinklers: An Update Using Data Through 2019. National Fire Protection Association.
Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
8. Appendices
Appendix A: Data Dictionary
Key columns used in the analysis:
Incident Number: Unique identifier for each incident
Incident Date: Date and time of the incident
Primary Situation: Type of incident (e.g., Building fire, Cooking fire)
Property Use: Type of property where the incident occurred
AES Present: Whether an automatic extinguishing system was present
AES Type: Type of automatic extinguishing system
AES Operation: Whether the system operated
AES Performance: How the system performed
Appendix B: Model Parameters
The Random Forest classification model used the following parameters:
n_estimators: 100
max_depth: 10
min_samples_split: 5
min_samples_leaf: 2
class_weight: Calculated based on class distribution
random_state: 42
n_jobs: -1 (use all available processors)
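For reference, the parameters above map directly onto keyword arguments of scikit-learn's `RandomForestClassifier`. The sketch below is an assumption about how they might be wired together, not the notebook's actual code; `RF_PARAMS` and `build_model` are illustrative names, and scikit-learn is imported only when the model is built.

```python
# Parameters as listed in Appendix B
RF_PARAMS = {
    "n_estimators": 100,
    "max_depth": 10,
    "min_samples_split": 5,
    "min_samples_leaf": 2,
    "random_state": 42,
    "n_jobs": -1,  # use all available processors
}


def build_model(class_weight):
    """Return an unfitted classifier; requires scikit-learn to be installed.

    class_weight is a dict mapping each class label to its weight,
    or the string "balanced".
    """
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(class_weight=class_weight, **RF_PARAMS)
```

Keeping the parameters in a single dictionary makes it straightforward to log them alongside the evaluation results.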
Appendix C: Visualizations
The analysis included the following key visualizations:
Distribution of sprinkler system types
Sprinkler system performance distribution
Effectiveness rate by system type
Temporal heat map of sprinkler activations by hour and day of week
Sprinkler incidents by hour of day
Sprinkler incidents by day of week
Confusion matrix for the classification model
ROC curves for each class
Feature importance chart
These visualizations are available in the accompanying Google Colab notebook.