Most, if not all, of the codes and standards governing the installation and maintenance of fire protection systems in buildings include requirements for inspection, testing, and maintenance activities to verify proper system operation on demand. As a result, most fire protection systems are routinely subjected to these activities. For example, NFPA 25 [1] provides specific recommendations for inspection, testing, and maintenance schedules and procedures for sprinkler systems, standpipe and hose systems, private fire service mains, fire pumps, water storage tanks, and valves, among others. The scope of the standard also includes impairment handling and reporting, an essential element in fire risk applications.

Given the requirements for inspection, testing, and maintenance, it can be qualitatively argued that such activities not only have a positive impact on building fire risk, but also help maintain building fire risk at acceptable levels. However, a qualitative argument is often not enough to give fire protection professionals the flexibility to manage inspection, testing, and maintenance activities using a performance-based/risk-informed approach. Explicitly incorporating these activities into a fire risk model, and taking advantage of the data infrastructure that already exists because of current requirements for documenting impairments, provides a quantitative approach for managing fire protection systems.

This article describes how inspection, testing, and maintenance of fire protection systems can be incorporated into a building fire risk model so that such activities can be managed using a performance-based approach in specific applications.


"Risk” and "fire risk” can be defined as follows:

  • Risk is the potential for realization of unwanted adverse consequences, considering scenarios and their associated frequencies or probabilities and associated consequences. [2]
  • Fire risk is a quantitative measure of fire or explosion incident loss potential in terms of both the event likelihood and aggregate consequences. [3]

Based on these two definitions, “fire risk” is defined for the purpose of this article as a quantitative measure of the potential for realization of unwanted fire consequences. This definition is practical because, as a quantitative measure, fire risk has units and results from a model formulated for specific applications. From that perspective, fire risk should be treated no differently than the output from any other physical model routinely used in engineering applications: it is a value produced by a model based on input parameters reflecting the scenario conditions. Generally, the risk model is formulated as:

$$\text{Risk} = \sum_i \text{Risk}_i = \sum_i \text{Loss}_i \times F_i$$

where:

Risk_i = risk associated with scenario i
Loss_i = loss associated with scenario i
F_i = frequency of scenario i occurring

That is, a risk value is the sum, over all identified scenarios, of the product of each scenario’s frequency and consequence. In the specific case of fire analysis, F and Loss are the frequencies and consequences of fire scenarios. The units of the frequency and consequence terms must multiply to risk units that are relevant to the specific application and can be used to make risk-informed/performance-based decisions.
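The summation above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `FireScenario` class, the scenario names, and all numeric values are hypothetical, and the loss is assumed to be expressed in dollars per event so that the result has units of dollars per year.

```python
from dataclasses import dataclass


@dataclass
class FireScenario:
    name: str
    frequency_per_year: float  # F_i, in events per year
    loss: float                # Loss_i, here assumed to be dollars per event


def total_risk(scenarios):
    """Return the sum of Loss_i * F_i over all scenarios (dollars per year)."""
    return sum(s.loss * s.frequency_per_year for s in scenarios)


# Hypothetical scenarios and values, for illustration only.
scenarios = [
    FireScenario("pump fire", frequency_per_year=0.02, loss=500_000.0),
    FireScenario("electrical cabinet fire", frequency_per_year=0.005, loss=2_000_000.0),
]

risk = total_risk(scenarios)
print(f"Total risk: {risk:,.0f} dollars/year")
```

Keeping the units explicit in the field names is one way to make sure the frequency and consequence terms multiply to a risk unit that is meaningful for the decision at hand.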

The fire scenarios are the individual units characterizing the fire risk of a given application. Consequently, the process of selecting the appropriate scenarios is an essential element of determining fire risk. A fire scenario must include all aspects of a fire event. This includes conditions leading to ignition and propagation up to extinction or suppression by different available means. Specifically, one must define fire scenarios considering the following elements:

  • Frequency – the frequency captures how often the scenario is expected to occur. It is usually represented as events/unit of time. Frequency examples may include number of pump fires per year in an industrial facility; number of cigarette-induced household fires per year, etc.
  • Location – the location of the fire scenario refers to the characteristics of the room, building, or facility in which the scenario is postulated. In general, room characteristics include size, ventilation conditions, boundary materials, and any additional information necessary for location description.
  • Ignition source – this is often the starting point for selecting and describing a fire scenario, i.e., the first item ignited. In some applications, a fire frequency is directly associated with ignition sources.
  • Intervening combustibles – these are combustibles involved in a fire scenario other than the first item ignited. Many fire events become “significant” because of secondary combustibles, i.e., the fire is capable of propagating beyond the ignition source.
  • Fire protection features – fire protection features are the barriers put in place to limit the consequences of fire scenarios to the lowest possible levels. Fire protection features may include active (e.g., automatic detection or suppression) and passive (e.g., fire walls) systems. In addition, they can include “manual” features such as a fire brigade or fire department, fire watch activities, etc.
  • Consequences – scenario consequences should capture the outcome of the fire event. Consequences should be measured in terms of their relevance to the decision-making process, consistent with the frequency term in the risk equation.

Although the frequency and consequence terms are the only two in the risk equation, all fire scenario characteristics listed previously should be captured quantitatively so that the model has enough resolution to become a decision-making tool.

The sprinkler system in a given building can be used as an example. The failure of this system on-demand (i.e., in response to a fire event) may be incorporated into the risk equation as the conditional probability of sprinkler system failure in response to a fire. Multiplying this probability by the ignition frequency term in the risk equation results in the frequency of fire events where the sprinkler system fails on demand.

Introducing this probability term in the risk equation provides an explicit parameter to measure the effects of inspection, testing, and maintenance in the fire risk metric of a facility. This simple conceptual example stresses the importance of defining fire risk and the parameters in the risk equation so that they not only appropriately characterize the facility being analyzed, but also have sufficient resolution to make risk-informed decisions while managing fire protection for the facility.

Introducing parameters into the risk equation requires accounting for dependencies that could otherwise mischaracterize the risk. In the conceptual example described earlier, introducing the on-demand failure probability of the sprinkler system requires the frequency term to include fires that were suppressed by sprinklers. The intent is to avoid reflecting the effects of the suppression system twice in the analysis, i.e., once through a lower frequency that excludes fires controlled by the automatic suppression system, and again through multiplication by the failure probability.
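The double-counting concern can be made concrete with a small numeric sketch. The function name and all values below are hypothetical, chosen only to show how the two treatments of the frequency term diverge.

```python
def failed_suppression_frequency(fire_frequency, p_sprinkler_failure):
    """Frequency of fires in which the sprinkler system fails on demand."""
    if not 0.0 <= p_sprinkler_failure <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return fire_frequency * p_sprinkler_failure


# Hypothetical values, for illustration only.
all_fires_per_year = 0.10  # counts every fire, whether or not sprinklers controlled it
p_fail = 0.05              # conditional probability of sprinkler failure given a fire

# Consistent treatment: the frequency includes sprinkler-suppressed fires.
consistent = failed_suppression_frequency(all_fires_per_year, p_fail)

# Double counting: the frequency already excludes sprinkler-controlled fires,
# and the failure probability is then applied a second time.
unsuppressed_fires_per_year = all_fires_per_year * p_fail
double_counted = failed_suppression_frequency(unsuppressed_fires_per_year, p_fail)

print(f"consistent:     {consistent:.5f} per year")
print(f"double counted: {double_counted:.5f} per year")
```

In this sketch the double-counted frequency is smaller than the consistent one by a factor equal to the failure probability, which would understate the risk.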


In repairable systems, which are those where the repair time is not negligible (i.e., long relative to the operational time), downtimes should be properly characterized. The term “downtime” refers to the periods of time when a system is not operating. “Maintainability” refers to the probabilistic characterization of such downtimes, which are an important factor in availability calculations. It includes the inspection, testing, and maintenance activities to which an item is subjected.

Maintenance activities generating some of the downtimes can be preventive or corrective. “Preventive maintenance” refers to actions taken to retain an item at a specified level of performance. It has the potential to reduce the system’s failure rate. In the case of fire protection systems, the goal is to detect most failures during testing and maintenance activities and not when the fire protection systems are required to actuate. “Corrective maintenance” represents actions taken to restore a system to an operational state after it is disabled due to a failure or impairment.

In the risk equation, lower system failure rates characterizing fire protection features may be reflected in various ways depending on the parameters included in the risk model. Examples include:

  • A lower system failure rate may be reflected in the frequency term if it is based on the number of fires where the suppression system has failed. That is, the number of fire events counted over the corresponding period of time would include only those where the applicable suppression system failed, leading to “higher” consequences.
  • A more rigorous risk-modeling approach would include a frequency term reflecting both fires where the suppression system failed and those where the suppression system was successful. Such a frequency will have at least two outcomes. The first sequence would consist of a fire event where the suppression system is successful. This is represented by the frequency term multiplied by the probability of successful system operation and a consequence term consistent with the scenario outcome. The second sequence would consist of a fire event where the suppression system failed. This is represented by the multiplication of the frequency times the failure probability of the suppression system and consequences consistent with this scenario condition (i.e., higher consequences than in the sequence where the suppression was successful).

Under the latter approach, the risk model explicitly includes the fire protection system in the analysis, providing increased modeling capabilities and the ability to monitor the performance of the system and its impact on fire risk.
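The two-sequence formulation described above might be sketched as follows. The function name, parameter names, and numeric values are assumptions made for illustration; the only structure taken from the text is that the fire frequency is split by the probability of suppression success or failure, each with its own consequence.

```python
def sequence_risks(fire_frequency, p_failure, loss_if_suppressed, loss_if_not_suppressed):
    """Split one fire frequency into a suppression-success and a suppression-failure sequence.

    Returns (risk of the success sequence, risk of the failure sequence).
    """
    if not 0.0 <= p_failure <= 1.0:
        raise ValueError("p_failure must be between 0 and 1")
    risk_success = fire_frequency * (1.0 - p_failure) * loss_if_suppressed
    risk_failure = fire_frequency * p_failure * loss_if_not_suppressed
    return risk_success, risk_failure


# Hypothetical inputs: 0.1 fires/year, 5% failure probability,
# $20,000 loss if suppressed, $1,000,000 loss if not.
success, failure = sequence_risks(0.1, 0.05, 20_000.0, 1_000_000.0)
print(f"success sequence: {success:,.0f} dollars/year")
print(f"failure sequence: {failure:,.0f} dollars/year")
print(f"scenario total:   {success + failure:,.0f} dollars/year")
```

Because the failure probability appears as its own parameter, a change in system performance can be traced directly to a change in the risk value.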

The probability of a fire protection system failure on demand reflects the effects of inspection, maintenance, and testing of fire protection features, which influence the availability of the system. In general, the term “availability” is defined as the probability that an item will be operational at a given time. The complement of the availability is termed “unavailability,” where U = 1 - A. A simple mathematical expression capturing this definition is:

$$A = \frac{u}{u + d}, \qquad U = \frac{d}{u + d} = 1 - A$$

where u is the uptime, and d is the downtime during a predefined period of time (i.e., the mission time).
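As a minimal sketch of this definition, the functions below compute A and U from recorded uptime and downtime. The one-year mission time and the 120 hours of downtime are hypothetical values, standing in for totals that would come from impairment records.

```python
def availability(uptime, downtime):
    """A = u / (u + d), with uptime and downtime in the same time unit."""
    total = uptime + downtime
    if uptime < 0 or downtime < 0 or total <= 0:
        raise ValueError("uptime and downtime must be non-negative with a positive sum")
    return uptime / total


def unavailability(uptime, downtime):
    """U = d / (u + d) = 1 - A."""
    return 1.0 - availability(uptime, downtime)


# Hypothetical example: a one-year mission time (8,760 hours) during which
# impairment records show the system out of service for 120 hours in total.
mission_hours = 8760.0
downtime_hours = 120.0
uptime_hours = mission_hours - downtime_hours

print(f"A = {availability(uptime_hours, downtime_hours):.4f}")
print(f"U = {unavailability(uptime_hours, downtime_hours):.4f}")
```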

Accurately characterizing the system’s availability requires quantifying equipment downtime. This can be done using maintainability techniques, i.e., based on the inspection, testing, and maintenance activities associated with the system and its random failure history.

An example would be an electrical equipment room protected with a CO2 system. For life safety reasons, the system may be taken out of service for some periods of time. The system may also be out for maintenance, or not operating due to impairment. Clearly, the probability of the system being available on demand is affected by the time it is out of service. It is in the availability calculations that the impairment handling and reporting requirements of codes and standards are explicitly incorporated into the fire risk equation.

As a first step in determining how the inspection, testing, maintenance, and random failures of a given system affect fire risk, a model for determining the system’s unavailability is necessary. In practical applications, these models are based on performance data generated over time from maintenance, inspection, and testing activities. Once the unavailability is explicitly modeled, maintenance activities can be managed with the goal of maintaining or improving fire risk. Examples include:

  • Performance data may suggest key system failure modes that could be identified in time with increased inspections (or completely corrected by design changes) preventing system failures or unnecessary testing.
  • Time between inspections, testing, and maintenance activities may be increased without affecting the system unavailability.

These examples stress the need for an availability model based on performance data. As a modeling alternative, Markov models offer a powerful approach for determining and monitoring system availability based on inspection, testing, maintenance, and random failure history. Once the system unavailability term is defined, it can be explicitly incorporated in the risk model as described in the following section.
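To give a sense of what a Markov availability model looks like, the sketch below uses the simplest possible case: a two-state model (available, unavailable) with a constant failure rate and a constant repair rate. This is an assumption made for illustration, not the model used in any particular application; a practical model for a standby fire protection system would likely need additional states, for example to represent failures that remain undetected until the next test. The rates shown are hypothetical.

```python
import math


def steady_state_unavailability(failure_rate, repair_rate):
    """Long-run unavailability of a two-state Markov model: U = lambda / (lambda + mu)."""
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate must be > 0")
    return failure_rate / (failure_rate + repair_rate)


def unavailability_at(t, failure_rate, repair_rate):
    """Unavailability at time t for a system that is available at t = 0."""
    total = failure_rate + repair_rate
    return steady_state_unavailability(failure_rate, repair_rate) * (1.0 - math.exp(-total * t))


# Hypothetical rates: one failure every two years on average,
# and a mean time to restore of one week (rates in events per year).
failure_rate = 0.5
repair_rate = 52.0

print(f"steady-state U = {steady_state_unavailability(failure_rate, repair_rate):.4f}")
print(f"U after 1 day  = {unavailability_at(1.0 / 365.0, failure_rate, repair_rate):.4f}")
```

In a model of this kind, the rates would be estimated from the inspection, testing, maintenance, and failure records, so that the unavailability can be updated as new performance data accumulate.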


The risk model can be expanded as follows:

$$\text{Risk} = \sum_i U \times \text{Loss}_i \times F_i$$

where U is the unavailability of a fire protection system. Under this risk model, F may represent the frequency of a fire scenario in a given facility regardless of how it was detected or suppressed. The parameter U is the probability that the fire protection features fail on-demand. In this example, the multiplication of the frequency times the unavailability results in the frequency of fires where fire protection features failed to detect and/or control the fire. Therefore, by multiplying the scenario frequency by the unavailability of the fire protection feature, the frequency term is reduced to characterize fires where fire protection features fail and, therefore, produce the postulated scenarios.

In practice, the unavailability term is a function of time in a fire scenario progression. It is often set to 1.0 (the system is not available) if the system will not operate in time (i.e., the postulated damage in the scenario occurs before the system can actuate). If the system is expected to operate in time, U is set to the system’s unavailability.
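This rule can be written as a small function. The sketch below is one possible reading of the paragraph: the parameter names and the example times are hypothetical, and treating the case where actuation and damage times are equal as "not in time" is a conservative assumption not stated in the text.

```python
def effective_unavailability(system_unavailability, time_to_actuate, time_to_damage):
    """Return the U value to use in the risk equation for one scenario.

    If the postulated damage occurs before (or, by assumption, at the same
    time as) the system can actuate, the system gets no credit and U = 1.0.
    Otherwise U is the system's unavailability.
    """
    if not 0.0 <= system_unavailability <= 1.0:
        raise ValueError("system_unavailability must be between 0 and 1")
    if time_to_damage <= time_to_actuate:
        return 1.0
    return system_unavailability


# Hypothetical example: the system actuates 3 minutes after ignition.
u_system = 0.02
print(effective_unavailability(u_system, time_to_actuate=3.0, time_to_damage=10.0))
print(effective_unavailability(u_system, time_to_actuate=3.0, time_to_damage=2.0))
```

The first call credits the system because damage is postulated after actuation; the second does not, so the full scenario frequency carries through to the consequence.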

Figure 1. Example of a Fire Scenario Progression Event Tree

To incorporate the unavailability comprehensively into a fire scenario analysis, a scenario progression event tree model can be used. Figure 1 illustrates a sample event tree. The progression of damage states is initiated by a postulated fire involving an ignition source. Each damage state is defined by a time in the progression of a fire event and a consequence within that time.

Under this formulation, each damage state is a different scenario outcome characterized by the suppression probability at each point in time. As the fire scenario progresses in time, the consequence term is expected to be higher. Specifically, the first damage state usually consists of damage to the ignition source itself. This first scenario could represent a fire that is promptly detected and suppressed. If such early detection and suppression efforts fail, a different scenario outcome is generated with a higher consequence term.

Depending on the characteristics and configuration of the scenario, the last damage state may consist of flashover conditions, propagation to adjacent rooms or buildings, etc. The damage states characterizing each scenario sequence are quantified in the event tree by failure to suppress, which is governed by the suppression system unavailability at pre-defined points in time and its ability to operate in time.
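The quantification of such an event tree can be sketched as follows. This is an illustrative simplification under stated assumptions: damage states are ordered in time, each state except the last is followed by one suppression branch, and the failure probability at each branch is supplied by the analyst (for example, the system unavailability if the system can operate by that time, or 1.0 if it cannot). The function names and all numeric values are hypothetical.

```python
def sequence_frequencies(ignition_frequency, p_fail_branches):
    """Frequency of ending in each damage state of a progression event tree.

    p_fail_branches[k] is the conditional probability that suppression fails
    at branch k, given the fire has progressed that far. The tree has
    len(p_fail_branches) + 1 damage states; the last one is reached only if
    every suppression branch fails.
    """
    frequencies = []
    reaching = ignition_frequency  # frequency of fires arriving at the current branch
    for p_fail in p_fail_branches:
        if not 0.0 <= p_fail <= 1.0:
            raise ValueError("branch probabilities must be between 0 and 1")
        frequencies.append(reaching * (1.0 - p_fail))  # suppressed here
        reaching *= p_fail                             # progresses to next state
    frequencies.append(reaching)                       # final damage state
    return frequencies


def event_tree_risk(ignition_frequency, p_fail_branches, losses):
    """Sum of frequency times loss over all damage states."""
    freqs = sequence_frequencies(ignition_frequency, p_fail_branches)
    if len(losses) != len(freqs):
        raise ValueError("one loss value is needed per damage state")
    return sum(f * loss for f, loss in zip(freqs, losses))


# Hypothetical three-state tree: ignition source only, room contents, flashover.
freqs = sequence_frequencies(0.01, [0.1, 0.5])
risk = event_tree_risk(0.01, [0.1, 0.5], [5_000.0, 100_000.0, 2_000_000.0])
print([f"{f:.5f}" for f in freqs])
print(f"risk = {risk:,.0f} dollars/year")
```

A useful check on any such tree is that the sequence frequencies add up to the ignition frequency, since every postulated fire must end in exactly one damage state.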

Francisco Joglar is with Hughes Associates.


  1. NFPA 25, Standard for the Inspection, Testing, and Maintenance of Water-Based Fire Protection Systems, National Fire Protection Association, Quincy, MA, 2011.
  2. SFPE Engineering Guide - Fire Risk Assessment, Society of Fire Protection Engineers, Bethesda, MD, November, 2006.
  3. Barry, T. Risk Informed, Performance Based, Industrial Fire Protection, Tennessee Valley Publishing, Knoxville, TN, 2002.