Reliability of Fire Alarm Systems

Fire alarm systems are reliable. But what does that mean? From a risk-informed, performance-based design standpoint, it means that when a design-basis event occurs, the fire detection and alarm system will perform as intended, with a predictably high frequency. Seldom is it necessary to quantify what that high frequency is. Consumer Reports does not provide relative reliabilities of different manufacturers' alarm systems as it does for automobiles. Yet, there is an overall confidence that a fire alarm system consisting of listed components, designed by a competent engineer, installed by a competent vendor and inspected by a competent technician will perform as intended. These qualifying phrases are not disclaimers, but rather the key elements of assuring reliable fire alarm systems.

Defining Reliability
The SFPE Handbook of Fire Protection Engineering provides a background discussion on component and system reliability.¹ The handbook defines "reliability" as the ability of an item (product, system, etc.) to operate under designated operating conditions for a designated period of time or a number of cycles. For fire alarm systems, as with many active fire protection features, reliability discussions must focus on both time, or useful life, and cycles, or performance during design-basis fire events.

Reliable system performance could be narrowly defined in terms of failure on demand. This would ignore unwanted or false alarms as a measure of reliability. Performance does not stop when the alarm sounds. So, the response of those reacting to the alarm must be considered, and unwanted alarms could affect the consistency or dependability of that reaction. However, it is worthwhile to separate the two responses, that of the alarm system and that of the responders, for these discussions.

Fire alarm systems consist of components performing a variety of functions to accomplish the design objective. In that regard, component reliability affects system reliability, but it is not system reliability. If a component failure were considered a system failure, adding redundant components would reduce system reliability rather than improve it. That would be equivalent to considering a Class A² supervised system half as reliable as a Class B system, since it has twice as many communication paths that could fail. However, since component failures can lead to system failures, the component level is a good place to start.
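The redundancy argument can be illustrated with a short calculation. The sketch below is only an illustration: the path reliability of 0.99 is a hypothetical figure, and the two paths of the redundant circuit are assumed to fail independently, which real wiring in a shared route may not do.

```python
def series_reliability(reliabilities):
    """Every element must work, so the individual reliabilities multiply."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel_reliability(reliabilities):
    """At least one element must work: 1 minus the chance that all fail."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail


PATH_RELIABILITY = 0.99  # hypothetical value for one communication path

# Single path (Class B style): the circuit works only if its one path works.
single_path = series_reliability([PATH_RELIABILITY])

# Redundant paths (Class A style): the circuit works if either path works.
redundant_paths = parallel_reliability([PATH_RELIABILITY, PATH_RELIABILITY])

# The mistaken view: counting any path failure as a system failure.
naive_view = series_reliability([PATH_RELIABILITY, PATH_RELIABILITY])
```

Under these assumptions the mistaken view scores the redundant circuit below the single path, while treating the paths as alternatives scores it above.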

Component Reliability
Classic reliability references describe component reliability in terms of mean time to failure (MTTF), mean time between failures (MTBF) and failure rates. MTTF is the mean time from first use to the first and only failure of a component, and is sometimes referred to as that component's expected life. MTBF is the mean time between two consecutive failures and includes the time to discover and repair that failure as well as the time until it fails again. If detection of the failure is instantaneous, as with a supervised component, the major difference between MTBF and MTTF is the repair time.
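The relationship between the two measures can be written out as a small sketch. All numbers are hypothetical. The availability ratio (time working divided by total cycle time) is a standard steady-state approximation rather than something taken from the references cited here, and the half-year detection delay for the unsupervised case assumes the failure is found only at an annual test.

```python
def mtbf_hours(mttf_hr, detection_hr, repair_hr):
    """MTBF as described in the text: time to fail plus time to discover and repair."""
    return mttf_hr + detection_hr + repair_hr


def availability(mttf_hr, detection_hr, repair_hr):
    """Fraction of each failure cycle during which the component is working."""
    return mttf_hr / mtbf_hours(mttf_hr, detection_hr, repair_hr)


MTTF_HR = 100_000.0   # hypothetical component MTTF
REPAIR_HR = 24.0      # hypothetical time to repair once the failure is known

# Supervised component: the failure is announced immediately.
supervised = availability(MTTF_HR, 0.0, REPAIR_HR)

# Unsupervised component: assume the failure waits, on average,
# half of a one-year test interval before it is discovered.
unsupervised = availability(MTTF_HR, 8760.0 / 2.0, REPAIR_HR)
```

With instantaneous detection, MTBF exceeds MTTF only by the repair time, which is the point made above.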

Failure rates, or rate of occurrence of failure, typically can be plotted to form a curve in the shape of a bathtub. The bathtub curve implies that failure rates are highest in a system or component when just manufactured, when defects and damages manifest themselves. These failures are sometimes referred to as burn-in failures. Then, the failure rates are much lower and stable for the useful life of the device. The failure rates increase as the device reaches the end of its useful life when wear-out failures occur.
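One common way to draw a bathtub curve is to add three failure-rate terms: a decreasing one for burn-in, a constant one for the useful life and an increasing one for wear-out. The sketch below does this with Weibull hazard functions. The choice of Weibull terms and every parameter value are assumptions made for illustration, not data for any fire alarm device.

```python
def weibull_hazard(t_hr, shape, scale_hr):
    """Weibull failure rate at time t_hr (t_hr must be greater than zero).

    shape < 1 gives a falling rate, shape == 1 a constant rate,
    and shape > 1 a rising rate.
    """
    return (shape / scale_hr) * (t_hr / scale_hr) ** (shape - 1.0)


def bathtub_hazard(t_hr):
    """Sum of burn-in, useful-life and wear-out failure rates (per hour)."""
    burn_in = weibull_hazard(t_hr, 0.5, 1.0e6)       # hypothetical, falling
    useful_life = 1.0e-6                             # hypothetical, constant
    wear_out = weibull_hazard(t_hr, 5.0, 150_000.0)  # hypothetical, rising
    return burn_in + useful_life + wear_out


early = bathtub_hazard(100.0)      # shortly after manufacture
middle = bathtub_hazard(50_000.0)  # during the useful life
late = bathtub_hazard(200_000.0)   # approaching the end of life
```

With these parameters the rate is highest early, lowest in the middle and rising again late, which is the shape the text describes.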

Ensuring reliable fire alarm systems throughout their useful life requires the implementation of quality processes in equipment manufacturing, system design, installation, programming, acceptance testing and subsequent inspection testing and maintenance. The quality process has to start with manufacturing.

The fire alarm industry has a long history of responding to the demands for better fire protection. Being innovative in a conservative field like fire protection has not always been easy, but the diligence of consensus code bodies, independent testing laboratories and equipment manufacturers has assured that fire alarm components and systems meet high quality standards. Generally, this has been accomplished without creating a significant impediment to innovation. NFPA 72, National Fire Alarm Code,² and its predecessors, NFPA 72A through E, have established requirements for listing of components and compatibility of components with control units. That listing process is an essential and effective element in assuring the quality of components and reliability of systems.

Underwriters Laboratories Listing Process
Every engineer responsible for the design of fire alarm systems should become familiar with the standards and processes of Underwriters Laboratories (UL) listing. The process provides a structure and oversight to the manufacturing process that greatly improves component reliability. It also establishes performance expectations for fire alarm hardware. For example, when a new smoke detector is being brought to market, UL studies the device and completes a detailed technical description, which identifies the critical and semicritical parts of the device to recognize its failure modes.³,⁴ UL requires drawings of the housing and restricts the modification of the housing without prior approval by UL, since the housing could influence smoke entry to the sensing chamber. UL requires the manufacturer to submit its Quality Assurance (QA) plan. UL exposes the detectors to various test fires, to each of which every device must respond within a prescribed time frame. As part of the production process, UL requires the calibration of every device at the factory. It further requires testing of several devices per shift in the smoke box as verification of the effectiveness of the calibration. UL provides follow-up by visiting the manufacturer at least four times a year and collecting random samples off the production line for retesting at UL at least once per year. UL 268³ includes a section on Reliability Prediction and Criteria for Acceptance.

A similar process is followed for control units and accessories listed under UL 864.⁵ In addition to testing the hardware for endurance and environmental effects, UL performs a thorough check of the software. Many of the system functions on analog-addressable systems are controlled by programming in the field. To minimize the likelihood of software malfunctions during field programming, UL even tests faulty keystrokes to assess their influence on the software response.

Manufacturers' Quality
The manufacturers of fire alarm systems and components implement quality programs that exceed the minimum requirements of UL. An important part of the quality process is the warranty on equipment. The warranty period offered by manufacturers varies from one to three years from the time of installation. As part of the quality assurance (QA) process, all warranty returned components are studied to determine the root cause of failure. If a failure is determined to be common to other components in the batch or a process, corrective actions are taken to mitigate the problem. It should be pointed out that the majority of component failures are discovered during system programming and commissioning. As such, they do not affect system reliability since the system is not yet in service. Based on discussions with the major manufacturers, a reasonable estimate would be that 0.1 percent to 0.5 percent of components are returned for warranty replacement, or 99.5 percent to 99.9 percent of components need no repair or replacement. This would be the high end of the component failure rate expressed as the burn-in failures. Manufacturers also calculate MTBF for various components as part of their QA trending. That is not a system reliability, but provides insight into the quality of the hardware.
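To put the return-rate estimate in system terms, the sketch below scales it to a whole installation. The 500-component system size is hypothetical, and the calculation assumes each component is returned or not independently of the others.

```python
def expected_returns(num_components, return_rate):
    """Average number of components returned under warranty."""
    return num_components * return_rate


def probability_no_returns(num_components, return_rate):
    """Chance that not a single component is returned (independence assumed)."""
    return (1.0 - return_rate) ** num_components


NUM_COMPONENTS = 500                 # hypothetical system size
LOW_RATE, HIGH_RATE = 0.001, 0.005   # the 0.1 to 0.5 percent range from the text

returns_low = expected_returns(NUM_COMPONENTS, LOW_RATE)
returns_high = expected_returns(NUM_COMPONENTS, HIGH_RATE)
no_returns_low = probability_no_returns(NUM_COMPONENTS, LOW_RATE)
no_returns_high = probability_no_returns(NUM_COMPONENTS, HIGH_RATE)
```

For a system of this size, the estimate works out to somewhere between half a component and two and a half components returned on average.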

System Reliability
The requirements of NFPA 72 establish a level of performance related to reliability. The fire alarm industry was decades ahead of other instrumentation systems by requiring circuit and component supervision and self-diagnostics. The establishment of Class and Style of circuits created an implicit selection of reliability. Virtually all circuits are supervised. Any loose connection, break or fault will be annunciated. In some cases, the circuits are still required to perform even under certain failure modes. Prior to the recombination of the NFPA 72 series of standards, NFPA 72D, Standard for Proprietary Fire Alarm Systems, required all initiating device circuits (IDC) to be Class B and all signaling line circuits, or circuits between panels and between buildings, to be Class A. This provided an implicit level of reliability based on the consequence of failure. One zone of detection devices could be impaired, but not multiple panels or multiple buildings. Later drafts introduced design options, but limited the number of devices that could be on a Class B supervised circuit. The current edition of NFPA 72² allows design decisions to be made on the level of performance of circuits. Still, if a failure occurs, it is announced within seconds and need not wait to be discovered until the next testing cycle. This is a significant feature for maintaining system reliability. If someone chooses to ignore the trouble alarm by silencing it at the panel, the alarm is required to reannounce in 24 hours if not properly reset.

Another code-required reliability feature for fire alarm systems is the provision of reliable and backup power supplies for the system. The code requires the backup power supply to allow the system to function in standby mode for 24 hours, and then still be able to perform its alarm function for a period of time, either five minutes or 15 minutes, depending on the system.
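The standby-plus-alarm requirement translates into a simple battery capacity estimate. The sketch below is a minimal version of that arithmetic. The current draws are hypothetical, and it deliberately leaves out any safety margin or derating for battery age and temperature that a real calculation would include.

```python
def required_capacity_ah(standby_current_a, alarm_current_a,
                         standby_hours, alarm_minutes):
    """Amp-hours needed to run in standby and then in alarm.

    No safety margin or battery derating is applied here.
    """
    standby_ah = standby_current_a * standby_hours
    alarm_ah = alarm_current_a * (alarm_minutes / 60.0)
    return standby_ah + alarm_ah


STANDBY_CURRENT_A = 0.5  # hypothetical panel draw in standby
ALARM_CURRENT_A = 2.0    # hypothetical draw with notification appliances on

# 24 hours of standby followed by 5 or 15 minutes of alarm, as in the text.
capacity_5_min = required_capacity_ah(STANDBY_CURRENT_A, ALARM_CURRENT_A, 24.0, 5.0)
capacity_15_min = required_capacity_ah(STANDBY_CURRENT_A, ALARM_CURRENT_A, 24.0, 15.0)
```

Even with a much higher alarm current, the 24-hour standby period dominates the total in this example.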

When assessing system reliability based on component failures, the failure mode of the different components will have an influence. For example, if a photoelectric smoke detector fails because of a light source failure, that detector will not respond to a fire (although a trouble alarm will sound at the panel). However, other detectors on the same Class B circuit will respond. Depending on the system layout, this could mean anything from a slight delay in response time for smoke to travel to an adjacent device, to a fire enveloping the room of origin if the failed detector is the only one in the room. Circuit failures (loose connections, open circuits or ground faults) in Class B-supervised circuits will normally prevent devices downstream of the failure (beyond the failure away from the control unit) from responding (see Figure 1). This could have a more significant effect on system performance, again dependent on system layout and fire location. A similar analogy would apply for notification appliances; the system impact of a component failure depends on the failure mode of the component and the system layout and design. Failure of one appliance in an open-space office occupancy or a shopping mall may not create a significant delay in evacuation. However, the loss of a notification appliance circuit, again depending on the system layout, could have a greater effect. In both these cases, if Class A circuits are provided, the device failure or circuit failure would limit the effect of the failure to the localized component since a redundant path of communication is provided (see Figure 2). If a hot short in a circuit is a concern, isolation devices can be used to reduce the extent of the impairment.
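The difference between the two circuit classes under an open-circuit fault can be sketched as a small model. It is a simplification: devices are numbered outward from the control unit, only open circuits are modeled (not ground faults or shorts), and the function and its segment numbering are hypothetical conventions chosen for this illustration.

```python
def reachable_devices(num_devices, open_segments, class_a):
    """Return the devices that can still communicate with the control unit.

    Segment i is the wire run feeding device i, so segment 0 leaves the panel.
    On a Class A circuit, segment num_devices is the return run to the panel.
    """
    reachable = []
    for device in range(num_devices):
        # Outgoing side: every segment from the panel to this device is intact.
        forward_ok = all(seg not in open_segments
                         for seg in range(device + 1))
        # Return side (Class A only): every segment from this device
        # back to the panel is intact.
        return_ok = class_a and all(seg not in open_segments
                                    for seg in range(device + 1, num_devices + 1))
        if forward_ok or return_ok:
            reachable.append(device)
    return reachable


# One open circuit on the run feeding device 2, five devices on the circuit.
class_b_result = reachable_devices(5, {2}, class_a=False)
class_a_result = reachable_devices(5, {2}, class_a=True)

# Two open circuits on a Class A circuit isolate the devices between them.
class_a_two_opens = reachable_devices(5, {1, 4}, class_a=True)
```

In this model a single open on the Class B circuit drops every device beyond the fault, while the Class A circuit keeps all of them and loses devices only when a second open traps them between two faults.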

Manufacturers have been innovative in building in safeguards to control units to minimize the effects of component failure. One such feature is promoted as operating in a degraded mode. In this condition, if a processor (CPU or loop controller) fails, the initiation devices will still activate and the control unit will still sound a general alarm.

When Bad Things Happen to Good Systems
The listing process, the manufacturing process and the code requirements are all significant contributors to reliable fire alarm systems. So what can go wrong? Before discussing engineering design and installation issues, it is important to point out the scourge of power transients. Lightning can be a cause of panel and component failure. Systems designed to function at 24 volts and milliamps of current need to be protected against surges. Most panels have built-in protective devices, but even these may be sacrificial and still not be enough to prevent damage. Surge suppression and isolation should be provided, as a minimum, in every circuit entering or leaving a building. It should be noted that warranties typically do not cover lightning damage.

Another potential issue affecting system reliability is modifications/additions to existing systems. Often, fire alarm systems are installed in dynamic facilities where changes occur. These changes can often require modifications of or additions to an existing system. These changes can introduce opportunities for unreliability. The simplest example is the need to take the system offline to incorporate tie-ins. If not managed properly, these impairments could lead to extended outages. Device compatibility can also be an issue. Manufacturers try to maintain backward compatibility of equipment, but sometimes subtle changes in design or firmware will create problems. These are solvable problems, but they may not manifest themselves immediately. Likewise, programming can create problems. When pushing a controller toward its limits, sometimes programming does not respond as intended. Again, these problems are readily discovered and fixed, but they emphasize the need for the code requirement to test a percentage of existing devices after programming changes.

Role of the Design Engineer
The fire protection engineer has a vital role to play to ensure reliable performance of fire alarm systems. The first role is establishing the performance objectives of the system. In performance-based designs, the integrated fire protection strategy must respond to design-basis events (fire scenarios). These objectives establish expectations placed on the system and define what constitutes a success or failure on demand. The challenge going forward is developing accurate quantitative predictions of when detection devices respond (see the box on FPRF initiatives). Failures are often reported anecdotally: people detected the fire before the smoke detector did, or the smoke detector never went off because the fire was so small. These are not necessarily system failures, but rather faulty or mismanaged expectations concerning system performance. U.S. civil courts are littered with contests over unfulfilled expectations. Here again, understanding the room fire testing done by UL under UL Standards 217 and 268 will help in establishing realistic performance expectations.

The second role is establishing the system reliability in terms of circuit integrity and performance. This includes not only the specification of Class and Style, but also the provisions of survivability. These performance attributes assure the system response, after detection, meets performance objectives. The requirements and options described in NFPA 72 are an excellent starting point. Additionally, evaluating the influences of component failure modes on system performance is useful to assure that sufficient robustness and redundancy are built into the design.

The third role is to help reduce unwanted and false alarms. To achieve this, the selection of devices and their location and placement can be critical. Through many iterations of NFPA 72 based on lengthy debates, guidance has been given in the code related to ambient conditions which could have an adverse effect on detector performance. The code also incorporated alarm verification. This feature allowed delays in alarm panel response to detector actuation to reduce the cases of transient conditions (e.g., puffs of dust or blasts of air), causing an unwanted system response.

Many discussions within the alarm industry have focused on unwanted alarms. Maintaining that delicate balance between device sensitivity and device stability (not prone to premature response) is a challenge. System designers should select the proper devices and locate them such that they can meet the fire protection design objectives without exposing the devices to non-fire stimuli capable of causing unwanted alarms.

Role of the Installer
With the code requirements and the manufacturer's instructions required by equipment listings, installation should be straightforward. Yet, the overwhelming causes of failures during acceptance testing are a result of installation problems, not hardware. Many alarm system panels will help find wiring problems, such as loose connections or ground faults. Still, pinpointing the locations of these problems can be a tedious endeavor. Often, conduit and wiring are installed by an electrical contractor rather than the fire alarm vendor. This makes for some exciting finger pointing when commissioning takes longer than scheduled and holds up certification. A particular problem is finding ground faults on shielded wiring. These faults may not always manifest themselves during acceptance testing, but can show up later. The good news is they seldom prevent the system from functioning, even though they keep the panel in trouble until the faults are found and corrected.

Inspection, Testing and Maintenance
The objective of inspection and testing is to discover component failures that could prevent adequate performance on demand and to discover those failures prior to a demand. One would expect that the frequency of these activities should be adjusted to the expected frequency of demand and failure rate of components. This is the concept of reliability-centered maintenance (RCM). While these are useful techniques to assess the adequacy of an Inspection, Testing and Maintenance (ITM) program, the code-required activities and frequencies take a one-size-fits-all approach. A more accurate assessment of the state-of-the-art fire alarm systems would indicate that more predictive maintenance and less inspection are needed. With the self-diagnostic nature of these systems, reliability would be ensured more effectively by the prompt response to and resolution of trouble alarms than increasing inspection frequencies. Inspection activities should focus on those features which are not supervised or monitored. These are usually configuration management issues, like new walls (impediments to smoke transport or sound), new or modified ventilation (impairments to detector response), and new or modified equipment (potential sources of smoke), as well as the obvious bags duct-taped over devices.
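The trade-off between inspection frequency and prompt response to trouble signals can be shown with a common approximation. For a failure found only by periodic inspection, the average fraction of time spent failed is roughly the failure rate times half the inspection interval. For a supervised failure, it is roughly the failure rate times the time taken to respond and repair. Both approximations assume a constant failure rate and a small result, and every number below is hypothetical.

```python
def unavailability_periodic(failure_rate_per_hr, inspection_interval_hr):
    """Average fraction of time failed when failures are found only at inspection."""
    return failure_rate_per_hr * inspection_interval_hr / 2.0


def unavailability_supervised(failure_rate_per_hr, response_hr):
    """Average fraction of time failed when failures are announced immediately."""
    return failure_rate_per_hr * response_hr


FAILURE_RATE_PER_HR = 1.0e-5  # hypothetical component failure rate

annual_inspection = unavailability_periodic(FAILURE_RATE_PER_HR, 8760.0)
quarterly_inspection = unavailability_periodic(FAILURE_RATE_PER_HR, 8760.0 / 4.0)
trouble_fixed_in_a_day = unavailability_supervised(FAILURE_RATE_PER_HR, 24.0)
trouble_fixed_in_a_month = unavailability_supervised(FAILURE_RATE_PER_HR, 30.0 * 24.0)
```

In this example, a supervised failure resolved within a day leaves the component far more available than quadrupling the inspection frequency would, while a trouble signal left unresolved for a month gives back much of that advantage.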

As part of a reliability-centered maintenance study of fire protection systems and equipment conducted for the U.S. Air Force, a Failure Modes and Effects Analysis was completed on various fire alarm system components.⁶ These failure modes were characterized by the risk, based on probability of failure on demand and system degradation, by cause of failure and ITM tasks. One of the most obvious results of this exercise was the verification of the extent that system supervision was provided for almost every failure mode. The only failure mode for which constant monitoring was not provided was the physical condition of devices. The covering or blocking of the entrance to smoke detectors was one example of an element that was not supervised. Even in this case, current control units monitor the cleanliness of devices by monitoring their sensitivity and providing warnings when sensitivity is nearing the limits of the control unit's ability to compensate. This feature is intended primarily to reduce unwanted alarms, but could also identify some failure modes created by obstructions to the device.

Quantifying Reliability
The reliability of fire alarm systems in terms of response during an emergency is very high. Based on the robustness of the components, the quality of manufacturing and the independent oversight of the listing process, it is not unreasonable to expect reliability greater than 99.9 percent. The number means that the system would fail to perform as intended for less than one fire in a thousand. This value is generally considered limited by the likelihood of human response to a trouble alarm of supervised components rather than the failure rate of components. How reliable is reliable enough for fire alarm systems? That depends on the risk, including the likelihood or frequency of fires and the consequence of system failure. The consequence will depend on fire development, other elements of the fire protection strategy and those (people or property) at risk. It is reasonable to conclude that a code-compliant fire alarm system as described in the introduction would be the most reliable part of the fire protection features, due in large part to the supervision and self-diagnostics provided. With selective use of Class A circuits and isolation modules, those cases of highest risk could be designed to an even higher level of reliability.
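The link between reliability and fire frequency can be made concrete with a short calculation. The fire frequency of one fire per hundred years is a hypothetical figure chosen only to show the arithmetic; consequence, the other half of risk, is not modeled.

```python
def failed_demands_per_year(fires_per_year, reliability):
    """Expected number of fires per year in which the system fails to perform."""
    return fires_per_year * (1.0 - reliability)


def years_between_failed_demands(fires_per_year, reliability):
    """Average interval between fires in which the system fails to perform."""
    return 1.0 / failed_demands_per_year(fires_per_year, reliability)


FIRES_PER_YEAR = 0.01  # hypothetical: one fire per hundred years at a facility
RELIABILITY = 0.999    # the 99.9 percent figure from the text

failures_per_year = failed_demands_per_year(FIRES_PER_YEAR, RELIABILITY)
years_between = years_between_failed_demands(FIRES_PER_YEAR, RELIABILITY)
```

Under these assumptions a fire with a failed alarm system would be expected about once in 100,000 years at that facility. Whether that is reliable enough still depends on the consequence of such a failure.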

In terms of human response to the alarm signal, specifically the influence of false or unwanted alarms, reliability takes on a different meaning. Given an alarm response, what is the likelihood people will respond as intended? Although that question is beyond the scope of this article, it is clear that if the probability of success is any less than 1.0, the reliability of the combined alarm/response outcome will be less than that of the system alone. How that response would vary with changes in frequency of false or unwanted alarms is an interesting human behavior problem.
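The combined outcome is a simple product. In the sketch below the response probability of 0.9 is a hypothetical placeholder, since the article does not estimate it, and the system and the responders are assumed to succeed or fail independently.

```python
def combined_reliability(system_reliability, response_probability):
    """Probability that the system alarms and people respond as intended."""
    return system_reliability * response_probability


SYSTEM_RELIABILITY = 0.999   # the 99.9 percent figure used earlier in the article
RESPONSE_PROBABILITY = 0.9   # hypothetical likelihood of the intended response

combined = combined_reliability(SYSTEM_RELIABILITY, RESPONSE_PROBABILITY)
```

Any response probability below 1.0 pulls the combined figure below that of the system alone, and here the human term dominates the result.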

Fire Protection Research Foundation (FPRF) Initiatives

Several ongoing and proposed research efforts organized by the FPRF are championing the cause for better engineering tools and better understanding of fire alarm systems. These include both detection and notification related projects. One project being conducted at UL will evaluate the smoke characteristics of a variety of materials to see if their "smoke signatures" are bounded by the smoke room tests of UL 217 and UL 268. Currently, smoke detectors are evaluated in the room test using three fire sources: ponderosa pine, newspaper and a heptane-toluene mixture. The research program will evaluate these materials in the cone calorimeter and a product calorimeter and measure smoke particle size distribution and gas concentrations. A cross section of other synthetic and natural materials is also being evaluated. The end result will not only be an assessment of the relevance of the UL room test fire to the variety of materials found in residential occupancies, but also will provide useful data for characterizing the response of smoke detectors to known quantifiable fire sources.

Another project, which hopes to build on these data, is currently in the development stage. This project will attempt to develop better predictive response models for smoke detectors for small non-flaming fires using NIST's Fire Dynamics Simulator (FDS). The ability to predict the response of smoke detectors with acceptable accuracy will remove one of the larger uncertainties of performance-based design of fire alarm systems. The project could also provide insight into detector placement with respect to known sources of nuisance alarms.

Another project still in the planning stage will specifically address measuring and predicting the reliability of fire alarm systems. It is hoped that this effort will provide sufficient statistical data to measure the impact of code changes and design issues on system reliability.

These are but a few of the FPRF efforts related to fire alarm systems. For more information, contact the Foundation or visit their Web site at www.nfpa.org/foundation

Kenneth Dungan is with Risk Technologies.

References

¹Modarres, M., and Joglar-Billoch, F., "Reliability", SFPE Handbook of Fire Protection Engineering, Third Edition, National Fire Protection Association, Quincy, MA, 2002.
²NFPA 72, National Fire Alarm Code, National Fire Protection Association, Quincy, MA, 2007.
³UL 268, Standard for Safety Smoke Detectors for Fire Protective Signaling Systems, Underwriters Laboratories, Northbrook, IL.
⁴UL 217, Standard for Safety Single and Multiple Station Smoke Alarms, Underwriters Laboratories, Northbrook, IL.
⁵UL 864, Standard for Safety Control Units and Accessories for Fire Alarm Systems, Underwriters Laboratories, Northbrook, IL.
⁶MIL-HDBK 1117, Military Handbook: Inspection, Testing and Maintenance for Fire Protection Systems, 1 January 1999.
