Approaches to plant safety continue to evolve based on lessons learned, as well as new automation standards and technology
In the chemical process industries (CPI), one incident can have a tremendous impact on the people in the plant, the communities around it, the environment and the production asset.
This article outlines how learning from past incidents continues to drive the development of both new standards and new approaches to process automation as they relate to plant safety and security.
Learning from incidents
Today, there is a lot of information available about process incidents and industrial accidents from sources such as the Chemical Safety Board (www.csb.gov), Industrial Safety and Security Source (www.isssource.com) or Anatomy of an Incident (www.anatomyofanincident.com). Regardless of the source, and considering the amount of public discussion that takes place, particularly following the very large and visible industrial incidents, it’s important to take the opportunity to learn and seek opportunities to improve and prevent these incidents from happening again (Figure 1).
The impact of incidents and accidents on people, the environment and plant assets is significant. According to a Marsh LLC (www.marsh.com) publication [ 1 ], there is evidence that the petrochemical sector suffered a terrible period in terms of accidents between 1987 and 1991 (Figure 2). The losses recorded in that period (property damage of the production assets, liabilities and so on) were about ten times worse than in previous periods (1976–1986) and about 3.5 times worse than in following periods (1992–2011).
On the positive side, the Marsh report shows that there has been improvement in the sector after 1992. This improved safety can be attributed, in part, to the introduction of the process safety management (PSM) standards. Taking a closer look, it is evident that the significant loss for the 1987–1991 period was dominated by three explosion events, two of which were vapor-cloud explosions and account for 70% of the total losses for this timeframe. The key takeaway from this is that a single incident can have a tremendous impact on the people in the plant, the communities around it, the environment and, last but not least, the production asset.
In 1992, the U.S. Occupational Safety and Health Administration (OSHA; www.osha.gov), the agency tasked with the safety of personnel, issued the Process Safety Management of Highly Hazardous Chemicals standard (29 CFR 1910.119). This regulation set a requirement for hazard management and established a comprehensive PSM program that integrates technology, procedures and management practices. The introduction of this standard may be credited with improving process safety performance in U.S. hydrocarbon processing facilities.
Defining safety
In industry, safety is defined as the reduction of existing risk to a tolerable or manageable level, where risk is understood as the combination of the probability that a harmful incident occurs and the magnitude of the potential harm. In many cases, safety is not the elimination of risk, which could be impractical or infeasible.
Although the CPI must accept some degree of risk, that risk needs to be managed to an acceptable level, which in turn makes safety a societal term as well as an engineering term. Society establishes what is commonly accepted as safe, and engineers have to manage risk by introducing risk-reduction methods. These include human elements, such as company culture and work processes, and technologies that make the production facilities an acceptable place to work and a responsible neighbor in our communities.
The CPI has applied learnings from numerous events over the last 40 years. These incidents and accidents have resulted in changes to regulations and legislation and have driven the adoption of best practices that address the known factors at the root of those events.
Many of the best practices are related to understanding and evaluating hazards and defining the appropriate risk reduction, including measuring the effectiveness of the methodologies or technologies used in reducing the risk.
Risk-reduction methods using technology — including digital systems — have received extensive coverage in trade publications over time as they are important contributors to process safety and plant productivity. However, it is critical to recognize human factors and their impact on process safety in the design, selection, implementation and operation of technology.
Connecting PSM and FS
Organizations such as OSHA recognize Functional Safety Standard ISA 84 as a Recognized and Generally Accepted Good Engineering Practice (RAGAGEP) and one way to meet the PSM requirements defined in 29 CFR 1910.119. Applying ISA 84 is more than purchasing a technology with a given certification or using a particular technology scheme or architecture. Industry best practices such as ISA 84 consider a great deal of applied learning. ISA 84 is a performance-based standard and describes multiple steps before and after selecting and implementing safety system technologies. These steps, commonly referred to as the safety lifecycle, are also the result of applying lessons learned from incidents and events.
Research (as documented in the book Out of Control [ 2 ]) has shown that many industrial accidents (about 58%) have their root cause in poor specification or inadequate design. Additionally, users should consider that installing a system is not the end of the road, but rather another step in the lifecycle of the facility. Approximately 21% of incidents are associated with changes after the process is running, and about 15% occur during operation and maintenance.
ISA 84's grandfather clause
It is well known that Functional Safety Standard ISA 84.01-2004 contains a grandfather clause based on OSHA regulation 1910.119. This clause allows users to continue the use of pre-existing safety instrumented systems (SIS) that were designed following a previous RAGAGEP, and to effectively keep their older equipment as long as the company has determined that the equipment is designed, maintained, inspected, tested and operated in a safe manner. As indicated by Klein [ 3 ], that does not mean that the existing system can be grandfathered and ignored from that point forward.
The intent of the clause is for the user to determine if the PSM-covered equipment, which was designed and constructed to comply with codes, standards or practices no longer in general use, can continue to operate in a safe manner, and to document the findings. Therefore, the emphasis should be on the second part of the clause, which states that “the owner/operator shall determine that the equipment is designed, maintained, inspected, tested and operated in a safe manner.” And that determination is a continuous effort that should be periodically revised until said equipment is removed from operation and replaced with a system that is designed in line with current best practices.
Another consideration is that the clause would cover not only hardware and software, but also management and documentation, including maintenance, all of which should follow current standards — that is, the most recent version of ISA 84 or IEC 61511.
Emerging technologies
The last few decades have seen technology changing in all aspects of humankind’s daily activities. Process automation and safety automation have not escaped from such changes (Figure 3). Nevertheless, technology-selection criteria should respond to the risk-reduction needs in the manufacturing facility and consider the improvements that some of these technologies offer, such as enabling better visualization of the health of the production asset.
The new breed of systems not only addresses the need to protect plant assets, but allows users to bring safety to the center stage, side by side with the productivity of the plant, in many cases by eliminating technology gateways and interfaces that were common a few years ago.
There are also new developments, particularly in software, that help prevent human errors in the design, and that guide users to fulfill industry best practices using standard off-the-shelf functionality. Off-the-shelf products avoid the introduction of error by complex manual programming and configuration.
Although productivity and profitability of many manufacturing processes limit the rate of change in the process sector, whenever there is an opportunity, facilities should explore modern technologies and determine if they are a good fit. One should not assume the system shouldn’t be touched behind the shield of “grandfather clauses” that are believed to justify maintaining the system “as-is.” Once again, despite the comfort provided by known technologies, such as general-purpose programmable logic controllers (PLCs), it is important to keep in mind that those platforms might not satisfy the current risk-reduction requirements in the facility, and a significant investment to maintain the risk-reduction performance over the lifecycle of the plant asset might be required. Also, users will need to develop new competencies in order to understand new risk-reduction requirements and apply the next generation of technology accordingly.
Performance-based safety standards (IEC 61508 and IEC 61511/ISA 84) have changed the way safety systems should be selected. The days of simply choosing a certified product, or selecting a preferred technology architecture should be behind us; today’s system selection is driven by performance requirements and the risk-reduction needs of the plant.
Understand the hazards
Although this has nothing to do with the safety system technology, it is critical in the selection process to understand the scope of the process hazards and to determine the necessary risk reduction required. This should be done to create the safety requirements specification (SRS) necessary to start a system selection. Even when replacing an existing system, this is critical because the risk profile of the plant may have changed since installation.
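As a rough illustration of how the necessary risk reduction might be quantified for the SRS, the sketch below divides an unmitigated event frequency by a tolerable frequency to obtain a risk-reduction factor (RRF) and maps it to a SIL band. The band limits are the low-demand-mode ranges tabulated in IEC 61508 and IEC 61511; the function names and the example frequencies are assumptions made for illustration only and do not come from any real hazard study.

```python
# Illustrative sketch only. Function names and example numbers are
# hypothetical; the SIL bands are the low-demand-mode ranges from
# IEC 61508 / IEC 61511 (SIL 1: RRF >10 to 100, ... SIL 4: up to 100,000).


def required_rrf(unmitigated_per_year: float, tolerable_per_year: float) -> float:
    """Risk-reduction factor needed to bring the event frequency down to target."""
    if unmitigated_per_year <= 0 or tolerable_per_year <= 0:
        raise ValueError("frequencies must be positive")
    return unmitigated_per_year / tolerable_per_year


def sil_for_rrf(rrf: float) -> int:
    """Map a required RRF to a SIL band; 0 means no SIL-rated function is needed."""
    if rrf <= 10:
        return 0
    if rrf <= 100:
        return 1
    if rrf <= 1_000:
        return 2
    if rrf <= 10_000:
        return 3
    if rrf <= 100_000:
        return 4
    raise ValueError("RRF above 100,000 is beyond SIL 4; revisit the process design")


if __name__ == "__main__":
    # Assumed inputs: hazard expected once in 10 years without protection,
    # tolerable frequency set by the company at 2 in 10,000 years.
    rrf = required_rrf(unmitigated_per_year=0.1, tolerable_per_year=2e-4)
    print(f"Required RRF: {rrf:.0f}, target SIL: {sil_for_rrf(rrf)}")
```

The point of the sketch is that the SIL target falls out of the hazard analysis, not out of a product catalog, which is why the risk profile has to be rechecked even for a like-for-like replacement.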
Potential common-cause failures
There has been a long-standing requirement that a safety system must use a different (or diverse) technology from its process-automation counterpart to avoid common-cause failures. But most safety systems rely on component redundancy (hardware fault tolerance [HFT]) to meet reliability and availability requirements, introducing a degree of common-cause failure directly into the safety system. Rather than redundancy, modern systems now provide a diversity of technologies designed into logic solvers and input/output (I/O) modules, along with a high degree of diagnostics, to allow a simplex hardware configuration to meet safety integrity level (SIL) 3 requirements.
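To see why redundancy alone does not remove common-cause exposure, consider the widely used beta-factor approximation, in which a fraction beta of dangerous undetected failures is assumed to hit both redundant channels at once. The sketch below uses simplified textbook formulas for the average probability of failure on demand (PFDavg), not the full equations of the standards, and every numeric input is a hypothetical value chosen for illustration.

```python
# Simplified low-demand PFDavg approximations with a beta-factor
# common-cause term. Textbook forms only; failure rate, test interval
# and beta below are assumed values, not vendor or plant data.


def pfd_1oo1(lambda_du: float, test_interval_h: float) -> float:
    """Single channel: PFDavg ~ lambda_DU * TI / 2."""
    return lambda_du * test_interval_h / 2


def pfd_1oo2(lambda_du: float, test_interval_h: float, beta: float) -> float:
    """Redundant pair: independent term plus common-cause term."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    independent = ((1 - beta) * lambda_du * test_interval_h) ** 2 / 3
    common_cause = beta * lambda_du * test_interval_h / 2
    return independent + common_cause


if __name__ == "__main__":
    lam = 2e-6      # dangerous undetected failures per hour (assumed)
    ti = 8760       # proof-test interval in hours, i.e. one year (assumed)
    for beta in (0.0, 0.05, 0.10):
        total = pfd_1oo2(lam, ti, beta)
        share = (beta * lam * ti / 2) / total
        print(f"beta={beta:.2f}  PFDavg(1oo2)={total:.2e}  common-cause share={share:.0%}")
    print(f"PFDavg(1oo1)={pfd_1oo1(lam, ti):.2e}")
```

With inputs like these, even a modest beta makes the common-cause term the dominant contributor to the redundant pair's PFDavg, which is the motivation for diversity and diagnostics described above.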
Product-implementation diversity is also key. Even though most safety systems are manufactured by process-automation vendors, organizational diversity between the two product teams is only the first level of separation. Within the safety product team, leading suppliers will also separate the design group from the product-development group, and then again from the product-testing group.
Systematic capabilities
Systematic capabilities address how much protection against human factors is built into the safety system. Users should look for the following:
Certified software libraries that offer functions according to the SIL requirements of the application
Compiler restrictions to enforce implementations according to the SIL requirements
User-security management to separate approved from non-approved users for overrides, bypass and other key functions
Audit-trail capability to record and document changes to aid in compliance with functional safety standards
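The last two capabilities in this list can be pictured with a small sketch: a bypass request is granted only to approved users, and every request, granted or not, is written to an audit trail. This is a toy model under stated assumptions, not how any particular safety system implements these functions; the class name, user names and instrument tag are all hypothetical.

```python
# Toy model of user-security management plus an audit trail for bypasses.
# BypassManager, the user names and the tag "PT-101" are hypothetical.
# Requires Python 3.9+ for the built-in generic annotations.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class BypassManager:
    approved_users: set[str]
    active_bypasses: set[str] = field(default_factory=set)
    audit_trail: list[dict] = field(default_factory=list)

    def request_bypass(self, user: str, tag: str, reason: str) -> bool:
        """Grant the bypass only to approved users; log every attempt."""
        granted = user in self.approved_users
        if granted:
            self.active_bypasses.add(tag)
        self.audit_trail.append(
            {
                "time": datetime.now(timezone.utc).isoformat(),
                "user": user,
                "tag": tag,
                "reason": reason,
                "granted": granted,
            }
        )
        return granted


if __name__ == "__main__":
    manager = BypassManager(approved_users={"shift_supervisor"})
    manager.request_bypass("shift_supervisor", "PT-101", "transmitter calibration")
    manager.request_bypass("contractor", "PT-101", "no work permit")
    for entry in manager.audit_trail:
        print(entry)
```

Logging rejected attempts as well as granted ones is a deliberate choice in this sketch, since the record of who tried to override a function is often as useful for compliance review as the record of who succeeded.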
Separate, interfaced or integrated
Based on the SRS and other business needs, it is important to choose one of these three integration philosophies. Integrated systems offer many key benefits by drawing on common capabilities of the process automation system that are not directly related to the safety functions. However, an interfaced system, or even one kept completely separate, is also an option and needs to be thoroughly considered.
Protection layers
The use of multiple protection layers, or functionally independent protection layers (Figure 4) to be precise, is common in industry. These include technology elements such as the process control system and alarms. Safety instrumented systems are a last resort to prevent a hazard from escalating.
There are additional layers that mitigate the impact of a hazard or contain it. Once more, there are other layers of protection that are not based on technology, but on work processes or procedures that might be even more critical than the technology in use.
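The value of stacking independent layers can be shown with the basic arithmetic used in layer-of-protection analysis: the frequency of the unwanted outcome is the initiating-event frequency multiplied by the probability of failure on demand (PFD) of each layer. The sketch below assumes the layers are truly independent, which is exactly what has to be demonstrated in practice; the function name and all numbers are hypothetical.

```python
# Minimal layer-of-protection arithmetic. Valid only if the layers are
# functionally independent of each other and of the initiating event.
# The function name and the example values are assumptions.
from math import prod


def mitigated_frequency(initiating_per_year: float, layer_pfds: list[float]) -> float:
    """Event frequency remaining after all independent layers are credited."""
    if initiating_per_year <= 0:
        raise ValueError("initiating frequency must be positive")
    for pfd in layer_pfds:
        if not 0.0 < pfd <= 1.0:
            raise ValueError("each PFD must be in the range (0, 1]")
    return initiating_per_year * prod(layer_pfds)


if __name__ == "__main__":
    layers = {
        "process control system": 0.1,        # assumed PFD
        "alarm with operator response": 0.1,  # assumed PFD
        "safety instrumented function": 0.01, # assumed PFD
    }
    frequency = mitigated_frequency(0.1, list(layers.values()))
    print(f"Mitigated frequency: {frequency:.1e} per year")
```

If two layers share a sensor, a network or an operator, they cannot both be credited this way, which is why the independence of each layer matters as much as its individual performance.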
Most times, system interfaces are not designed, implemented or tested in accordance with industry best practices or current functional safety standards, and therefore they have an impact on the performance of the system. It has been common to ignore safety requirements on these interfaces. Failure of these interfaces should not compromise the safety system.
Integrated control and safety (Figure 5) is a modern alternative to previous point solutions that takes best practices into consideration and solves issues related to interface design, implementation and maintenance, both in compliance with functional safety standards and at a lower cost over the lifecycle.
Network security
The extended use of networked systems also opens the door to potential vulnerabilities. A lot of ground has been covered in this area over the last five years: industry has seen the emergence of standards to address new threats and the accelerated development of a strong relationship between safety and security. To satisfy the security requirements of a system network, the user should do the following:
Perform a full vulnerability assessment/threat modeling and testing of the different subsystems
Define the best security mechanism for each of those subsystems to cover any identified gaps
Perform a full vulnerability assessment/threat modeling and testing of the entire interfaced architecture
For users of an interfaced system, which could be “secured” using “air-gaps,” the key is establishing a security management system (SMS) for the interface architecture and supporting it over the system lifecycle.
Defense-in-depth in security
The principle of defense in depth (Figure 6) means creating multiple independent and redundant prevention and detection measures. The security measures should be layered, in multiple places, and diversified. This reduces the risk that the system is compromised if one security measure fails or is circumvented. Defense-in-depth tactics can be found throughout the SD3 + C security framework (secure by design, secure by default, secure in deployment, and communications).
Examples of defense-in-depth tactics include the following:
Establishing defenses for the perimeter, network, host, application and data
Security policies and procedures
Intrusion detection
Firewalls and malware protection
User authentication and authorization
Physical security
The key message is that, like in the case of safety, security is not resolved only by certification and it’s not an isolated activity after the product development is completed. Security is part of the design considerations early in the process and must be supported over the site lifecycle.
Summary
Although following the functional safety standards is not a silver bullet, it’s a good place to start the journey to improve safety in the process sector. If your industry requires compliance with OSHA regulation 1910.119, then complying with the requirements of ISA 84 for the automation portion of any project is one way to address PSM requirements.
Adopting ISA 84 is more than selecting a certified or SIL-capable logic solver or having a given redundancy scheme on the instrumentation. It requires a lifecycle approach that starts with the hazards analysis and defines the required risk reduction. It also involves evaluating technologies that better address the hazards and reduce the risk, as well as considering the technical requirements to mitigate risk to an acceptable level.
Although existing systems can be grandfathered, they can’t be ignored from that point forward. Rather, it is a continuous effort that should be periodically revised until the equipment is removed from operation and replaced with a system designed following current best practices.
When it’s time to select a new risk-reduction technology, consider that choosing a given technology scheme is not enough to address the functional safety requirements. Assuming that your existing technology or a “replacement in kind” still complies with the safety requirements of your process might lead to a “false sense of safety.” Consider the new breed of systems that not only addresses the need to protect the plant assets, but allows users to bring safety to the center stage, side by side with the productivity of the plant. In many cases this is done by eliminating technology gateways and interfaces that were common a few years ago, which also reduces lifecycle cost and maintenance efforts.
The selection criteria should begin with a proper understanding of the hazards and a technology assessment to address human factors, avoidance of common factors that could disable the safety instrumented system, and the integration of process safety information to the process automation systems; this integration is possible and must be done right.
As in the case of safety, security (or network security) is not resolved by certification alone, and it is not an isolated activity to be performed after product development is completed. It is part of the design considerations early in the process and must be supported over the site lifecycle.
Edited by Gerald Ondrey
References
1. Marsh LLC, The 100 Largest Losses 1972-2011: Large Property Damage Losses in the Hydrocarbon Industry, 22nd ed., New York, N.Y., 2012.
2. Health and Safety Executive (HSE), “Out of Control: Why Control Systems Go Wrong and How to Prevent Failure,” HSE, London, 2003; available for download at www.hse.gov.uk.
3. Klein, Kevin L., Grandfathering, It’s Not About Being Old, It’s About Being Safe, ISA, Research Triangle Park, N.C., 2005; Presented at ISA Expo 2005, Chicago, Ill., October 25–27, 2005.
4. Durán, Luis, Safety does not come out of a box, Control Engineering, February 2014.
5. Durán, Luis, Five things to consider when selecting a safety system, Control Engineering, October 2013.
6. Durán, Luis, The rocky relationship between safety and security, Control Engineering, June 2011.
Author
Luis Durán is the global product manager, Safety Systems at ABB Inc. (3700 W. Sam Houston Parkway South, Houston, TX 77042; Phone: 713 587 8089; Email: [email protected]). He has 25 years of experience in multiple areas of process automation and over 20 years in safety instrumented systems. For the last 12 years, he has concentrated on technical product management and product marketing management of safety automation products, publishing several papers on safety and critical control systems. Durán has B.S.E.E. and M.B.A. degrees from Universidad Simón Bolívar in Caracas, Venezuela and is a certified functional safety engineer (FSE) by TÜV Rheinland.