As in other process control industries, the “alarm problem” for pipeline operators has steadily evolved with the increased availability of field information that can be economically added to SCADA systems.

With this extra information comes considerably more alarms and alerts that, if left unchecked, can interfere with a controller’s ability to respond appropriately. In pipeline control centers around the world, it is not uncommon to see alarm summary screens completely filled with multiple pages of acknowledged and unacknowledged alarms.

Pipeline SCADA systems have some additional unique challenges for alarm handling due to the inherent latency of the data and the alarms associated with data reliability caused by intermittent communications outages.

Alarm overload factors

In addition to the rise in alarm activity caused by the increased volume of information being supervised, the following factors are some of the more common systemic issues that contribute to the alarm problem facing the pipeline industry:

  • Running pipelines/units harder increases the need for alarms
  • Years of lower profits resulting in reduced levels of maintenance
  • Increased safety requirements due to incidents in the industry
  • Increased reliance on monitoring technology, including security monitoring
  • Downsizing causing work overload – not enough time to complete all tasks
  • Limited availability of the right people to determine/configure alarms and settings.

Most of these factors are outside of the domain of pipeline operations, but all play a part in the overall alarm load for pipeline controllers.

Dealing with leak alarms

The most critical alarms for pipeline operators are leak/possible leak alarms. When a leak alarm is received, the controller immediately begins an investigation, using all the tools available to verify that the information being presented is valid. “False alarms” may be generated depending on several factors, including the sophistication of the leak detection system employed and how communication problems and metering and telemetry uncertainty, combined with pipeline transients, are handled. False alarms are alarms nonetheless, and play a vital part in the safe operation of the pipeline.

The frequency of false alarms and the appropriate response to them should be part of any control room management program. Any alarm management program designed to reduce alarms should include a specific philosophy for leak alarms that may differ from other operational alarms.

Regulatory requirements

The demand for improved alarm management on pipeline SCADA systems stems from the “Pipeline Inspection, Protection, Enforcement, and Safety Act of 2006” (Pipes Act), enacted by the U.S. Congress on December 7, 2006. Section 19 refers to implementing the recommendations contained in the National Transportation Safety Board’s report entitled “Supervisory Control and Data Acquisition in Liquid Pipelines” (adopted by the NTSB on November 29, 2005).

The Pipes Act states that the Secretary of Transportation shall issue standards to implement the following NTSB Recommendations:

  1. Implementation of the American Petroleum Institute’s Recommended Practice 1165 for the use of graphics on the supervisory control and data acquisition screens.
  2. Implementation of a standard for pipeline companies to review and audit alarms on monitoring equipment.
  3. Implementation of standards for pipeline controller training that include simulator or noncomputerized simulations for controller recognition of abnormal pipeline operating conditions, in particular, leak events.

Last September, PHMSA released its NOPR (Notice of Proposed Rulemaking) covering alarm management and other control room management issues. This NOPR was entitled “Federal Register Department of Transportation Pipeline and Hazardous Materials Safety Administration 49 CFR Parts 192, 193, and 195 Pipeline Safety: Control Room Management/Human Factors; Proposed Rule.”

Highlights of proposed rule

The proposed rule covers operators of gas pipelines, hazardous liquids pipelines and LNG facilities, and focuses on the control room management (CRM) issues identified in the Pipes Act and the requirements for a CRM plan. The control room management plan must address many items, including:

  • Several detailed provisions relating to alarm management ensuring controllers will respond appropriately to alarms and notifications. SCADA operations must be reviewed at least once per week. SCADA configuration and alarm management operations must be reviewed at least once each calendar year at intervals not to exceed 15 months, and they must include identification of abnormal or emergency operating conditions and a review of controller response actions.
  • Requirement to provide adequate and accurate information to the controller. This requirement includes a point-to-point baseline verification between field equipment and all SCADA system displays to verify 100% of the displays. Operators with fewer than 500 miles of pipeline must complete baseline verification within one year of final rule issuance. Operators with 500 miles or more are given three years after final rule issuance to complete it.
  • Recording of critical information during each controller shift and establishing “sufficient overlap” of controller shifts to permit the exchange of necessary information.
  • Change management that establishes communications between controller, management and field personnel when planning and implementing physical changes to pipeline equipment and configurations. SCADA system modifications are to be coordinated in advance to allow time for controller training and familiarization.
  • Establishment of a definition or threshold for “close-call events” for the purpose of evaluating event significance. Significant events must have a review conducted and the operator must share the information with all controllers.
  • A standard incorporated by reference is API RP-1165 (SCADA Display Standards); the proposed rule requires pipeline operators with SCADA systems to follow API RP-1165 in its entirety, or be able to demonstrate that the recommended practice is inapplicable or impracticable.
  • Each operator must have their CRM plan validated and signed by a senior executive officer.
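The annual review timing in the first item above combines two conditions: a review in every calendar year, and no more than 15 months between consecutive reviews. The following is a minimal sketch of how an operator might check a list of review dates against both conditions. The function names are hypothetical, and the reading of the interval (calendar months, counted from the previous review date) is an assumption rather than language taken from the proposed rule.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month when needed."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def review_schedule_problems(review_dates, max_gap_months=15):
    """Return a list of problems found in a review history.

    An empty list means both assumed conditions are met:
    a review in every calendar year, and no gap longer than max_gap_months.
    """
    dates = sorted(review_dates)
    if not dates:
        return ["no reviews recorded"]
    problems = []
    years_with_review = {d.year for d in dates}
    for year in range(dates[0].year, dates[-1].year + 1):
        if year not in years_with_review:
            problems.append(f"no review in calendar year {year}")
    for prev, nxt in zip(dates, dates[1:]):
        if nxt > add_months(prev, max_gap_months):
            problems.append(
                f"gap from {prev} to {nxt} exceeds {max_gap_months} months"
            )
    return problems
```

Checking both conditions separately matters because either can fail alone: reviews in December 2009 and February 2011 are only 14 months apart yet skip calendar year 2010, while reviews in January 2009 and December 2010 cover both years but sit 23 months apart.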

Alarm handling versus management

SCADA systems have varying degrees of sophistication in their native alarm-handling capabilities. With the ultimate goal of reducing the alarm load on controllers, along with improving the situational awareness that results from less alarm noise, a comprehensive alarm management life-cycle will look for opportunities for improvements in both the handling and the management of the alarms generated.

Alarm management life cycle

The ISA SP18 Instrument Signals and Alarms guidance document that is currently under development suggests a design and operation workflow for a complete alarm system life-cycle:

  • Design and implement
  • Monitor performance
  • Assess/audit/maintain
  • Manage change.

These items are discussed in detail below.

Design and implement

The design and implementation phase of an alarm management program consists of the identification or establishment of the current alarm activity benchmarks, followed by an initial rationalization exercise to begin the process of alarm improvement.

Rationalization is the process of reviewing a candidate alarm against the principles of the alarm philosophy and documenting the rationale for the alarm. Rationalization alone only provides some alarm reduction, along with improved alarm documentation. These results have proven to be temporary at best, since they do not resolve the issues that allowed the alarm system to get to its current state. Alarm design documentation does not necessarily lead to good alarm system performance. Rationalization will determine the design parameters that are to be implemented in the first iteration of the program.

Monitor performance

There are two key concepts in monitoring performance: derived alarm performance metrics and controller alarm loading. They are described in detail below.

Derived alarm performance metrics. Alarm performance benchmarks are typically designed to measure the effect of alarm activity on controller performance. Human factors research clearly states that too much information is just as harmful as too little. In the context of SCADA alarms, that realization generates the need for a way to determine the “sweet spot” between too many alarms (too much information), and too few.

In addition to alarm activity metrics used to find the “appropriate” amount of alarm activity, there are other classes of metrics that accurately expose shortcomings in alarm design and implementation. These metrics can help operators identify those alarms having little or no operational value. Using a mix of the various types of alarm performance metrics builds a better foundation from which to evaluate the performance, health, and value of an alarm system.

Controller alarm loading. The most widely accepted mechanism for determining the appropriate amount of alarm activity for a given operation is through an evaluation of controller loading. Monitoring the impact of alarms on an operating team helps determine the maximum thresholds for alarm activity in the context of all other operator responsibilities. Operators can do their job more effectively when alarm activity complies with established benchmarks. The common key performance indicators (KPIs) used to evaluate the performance of an alarm management program include the following:

  • Manageable steady state – the maximum rate at which a single operator can effectively address alarms.
  • Flood state – the rate at which a single operator is overwhelmed by alarm activations.
  • Average process alarm rate – the average rate at which a single operator can be expected to perform as required.
  • Percentage of time alarms exceed target average rate – the percentage of time that alarms exceed the target average alarm rate.
  • Peak alarm hourly rate – the target peak hourly rate for the most active hour within the evaluated time period.
  • Peak alarm minute rate – the target peak minute rate for the most active minute within the evaluated time period.
  • Alarm activity priority distribution – the suggested approximate distribution of alarm activity by priority (e.g., 5% high priority, 15% medium priority, and 80% low priority).
  • Alarms within 10 minutes of a major upset – the maximum rate in the 10 minute period following a major upset.
  • Chattering alarms – encourages proper maintenance and the use of alarm logic such as signal deadbands and filters.
  • Stale alarms – encourages the evaluation of alarms that remain active for an excessive period of time.
  • Average alarms/controller – a configuration target aimed at encouraging design discipline.
  • Unauthorized changes to alarm settings – encourages a strong management of change policy.
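Several of the rate-based indicators above can be derived directly from a time-stamped alarm log for one controller. The sketch below is illustrative only: the function name, the log layout (a timestamp and a priority label per activation), and the target value passed in are assumptions, since the article does not prescribe numeric thresholds. Hours and minutes are counted as fixed buckets from the start of the evaluation window, not as rolling windows.

```python
import math
from collections import Counter
from datetime import datetime


def alarm_kpis(alarms, start, end, target_per_hour):
    """Summarize alarm activity for one controller.

    alarms: iterable of (timestamp, priority) with start <= timestamp < end.
    target_per_hour: assumed target average rate used for the
    "percentage of time over target" figure.
    """
    window_hours = (end - start).total_seconds() / 3600
    per_hour, per_minute, per_priority = Counter(), Counter(), Counter()
    for ts, priority in alarms:
        offset = (ts - start).total_seconds()
        per_hour[int(offset // 3600)] += 1
        per_minute[int(offset // 60)] += 1
        per_priority[priority] += 1
    total = sum(per_priority.values())
    hour_buckets = math.ceil(window_hours)
    hours_over = sum(1 for n in per_hour.values() if n > target_per_hour)
    return {
        "average_per_hour": total / window_hours,
        "peak_per_hour": max(per_hour.values(), default=0),
        "peak_per_minute": max(per_minute.values(), default=0),
        "pct_hours_over_target": 100 * hours_over / hour_buckets,
        "priority_pct": {p: 100 * n / total for p, n in per_priority.items()},
    }
```

Comparing `priority_pct` with a suggested distribution such as 5/15/80 shows at a glance whether high-priority alarms are overused.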

A thorough understanding of the alarm key performance indicators supports good alarm system design and good alarm management practices. The richness of alarm activity data, joined and contrasted with other data resources, yields compelling insight into alarm system health and exposes opportunities for productivity gains through changes to the alarm design.
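One such design change is the signal deadband mentioned under chattering alarms. The sketch below shows the idea for a simple high alarm: the alarm activates when the value rises above the setpoint, but clears only once the value has dropped below the setpoint by more than the deadband. The function names and sample values are hypothetical, and real SCADA systems may implement deadbands differently.

```python
def high_alarm_states(values, setpoint, deadband=0.0):
    """Return the alarm state (True = active) after each sample.

    The alarm activates when a value exceeds the setpoint and clears
    only when a value falls below (setpoint - deadband).
    """
    active = False
    states = []
    for v in values:
        if not active and v > setpoint:
            active = True
        elif active and v < setpoint - deadband:
            active = False
        states.append(active)
    return states


def count_activations(states):
    """Count transitions from inactive to active."""
    previous = [False] + states[:-1]
    return sum(1 for prev, cur in zip(previous, states) if cur and not prev)


# A pressure signal hovering around a setpoint of 100 (illustrative data).
samples = [99, 101, 99.5, 100.5, 99.8, 100.2, 97, 101]
without_deadband = count_activations(high_alarm_states(samples, 100))
with_deadband = count_activations(high_alarm_states(samples, 100, deadband=2))
```

With these samples, the alarm activates four times without a deadband and twice with a deadband of 2, because small oscillations around the setpoint no longer clear and re-raise it.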

Maintaining control of controller loading is the foundation of alarm management. Complementing alarm activity metrics with additional event and configuration-based analysis strengthens the alarm management process and improves the efficiency with which alarm-related problems can be located and addressed.
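As one example of event-based analysis, chattering and stale alarms can be flagged from the alarm event history. This is a minimal sketch under stated assumptions: the function names, the tag names in the usage note, and the thresholds (three activations within one minute for chattering, 24 hours for stale) are illustrative choices, not values from a standard or from the proposed rule.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def chattering_tags(activations, min_count=3, window=timedelta(minutes=1)):
    """Return tags that activated at least min_count times within any
    span no longer than window.

    activations: iterable of (tag, timestamp) pairs.
    """
    by_tag = defaultdict(list)
    for tag, ts in activations:
        by_tag[tag].append(ts)
    flagged = set()
    for tag, times in by_tag.items():
        times.sort()
        # Slide over each run of min_count consecutive activations.
        for i in range(len(times) - min_count + 1):
            if times[i + min_count - 1] - times[i] <= window:
                flagged.add(tag)
                break
    return flagged


def stale_tags(active_since, now, max_age=timedelta(hours=24)):
    """Return tags whose alarm has been continuously active longer than max_age.

    active_since: dict mapping tag -> time the still-active alarm was raised.
    """
    return {tag for tag, since in active_since.items() if now - since > max_age}
```

For instance, a pressure tag that activates three times in 40 seconds would be flagged as chattering, and a communications alarm that has been standing for three days would be flagged as stale and become a candidate for a maintenance work order.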

Assess, audit and maintain

An important part of any alarm management life-cycle is the auditing and assessment of the program. This becomes mandatory under the PHMSA Control Room Management/Human Factors Proposed Rule. The proposed rule requires that operators undertake a detailed review of alarm configuration and management that monitors the number of alarms, potential systemic issues related to field equipment or the SCADA system, issues resulting in excessive or unusual alarms, unnecessary alarms, changes in controller performance in response to alarms, and a review of alarm set-point values.

Pipeline operators must use this information to evaluate and mitigate controller workload with respect to the number and nature of alarms received. Alarms indicating ongoing maintenance issues or communication problems should be resolved. It is important not to ignore known problems that continually cause alarms.

For regulatory compliance, an automated “alarm assessment report” can illustrate how an alarm management program has met the requirements of the rule by providing easy access to the key alarm performance indicators. As the report cycle progresses, operators can demonstrate improvement over time and identify additional improvements based on historical comparisons.
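The historical comparison at the heart of such a report can be sketched as follows. The function name, KPI names, and target values are hypothetical, and the sketch assumes that lower values are better for every indicator, which holds for rate-style metrics but not necessarily for others.

```python
def assessment_report(periods, targets):
    """Build report rows comparing each period with its target and
    with the previous period.

    periods: list of (label, {kpi_name: value}), oldest first.
    targets: {kpi_name: maximum acceptable value}; every KPI that
    appears in periods must have a target.
    """
    rows = []
    previous = {}
    for label, kpis in periods:
        for name, value in kpis.items():
            rows.append({
                "period": label,
                "kpi": name,
                "value": value,
                "meets_target": value <= targets[name],
                # Negative change means the indicator improved.
                "change": None if name not in previous else value - previous[name],
            })
        previous = kpis
    return rows
```

Each row records whether the target was met and how the value moved since the last report, which is the evidence an auditor would look for when asking whether the program is improving.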

Management of change

As part of an alarm management program, some alarms will be identified as requiring revised limits, some priorities will be changed to match the new alarm philosophy, and alarm suppression may be configured for related alarms. These changes will be made on the live system and will directly affect the behavior of the alarm system. It is crucial to have a sound management of change process for the SCADA system that provides an auditable record of these configuration changes. More importantly, this management of change should include a process for informing controllers of changes before they are put into effect on the live system, especially controllers who are not on shift at the time the changes are made.
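The two requirements in this paragraph, an auditable record and advance notice to every controller, can be sketched as a small data structure with a gate on applying the change. All class, field, and function names here are hypothetical; an actual SCADA system would enforce this through its own configuration tools and procedures.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class AlarmChange:
    """One proposed change to an alarm setting."""
    tag: str
    setting: str
    old_value: float
    new_value: float
    requested_by: str
    reason: str
    acknowledged_by: set = field(default_factory=set)
    applied_at: Optional[datetime] = None


def pending_acknowledgements(change, controllers):
    """Return the controllers who have not yet acknowledged the change."""
    return set(controllers) - change.acknowledged_by


def apply_change(change, controllers, audit_log, now):
    """Apply a change only after every controller has acknowledged it,
    and append an entry to the audit log."""
    missing = pending_acknowledgements(change, controllers)
    if missing:
        raise RuntimeError(f"controllers not yet informed: {sorted(missing)}")
    change.applied_at = now
    audit_log.append((now, change.tag, change.setting, change.old_value,
                      change.new_value, change.requested_by, change.reason))
```

Passing the full controller roster, not just the current shift, is what keeps off-shift controllers from being surprised by a changed limit when they return.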

Conclusion

As a result of new regulations, operators are now focusing on the good engineering practice of sound alarm management. The old days of pipeline controller consoles filled with an overwhelming number of alarms will soon be gone. In addition to reduced controller stress, improved alarm management will deliver:

  • Quieter control rooms
  • Better alarm flood control/avoidance
  • Maintenance savings (industry feedback indicates approximately 5%)
  • Increased pipeline uptime (avoidance of unplanned outages)
  • Better use of assets (one pipeline shutdown can erase the planned gains from a year of process improvements)
  • Regulatory compliance.

Remember that alarm management works best when it is incorporated as a standard practice. Alarm management is never really finished; it becomes an ongoing, natural part of maintaining a good control room management program.