Safe As podcast ep 12: Human performance in barrier/critical control systems

How do you consider the role of people within your barrier or critical control system – threat or adaptable element?

What are some fallacies of human performance, such as people being unreliable "bad apples", and how can we best incorporate the strengths of people while limiting performance variability?

Today’s paper is: McLeod, R. W. (2017). Human factors in barrier management: Hard truths and challenges. Process Safety and Environmental Protection, 110, 31-42.

Make sure to subscribe to Safe As on Spotify/Apple, and if you find it useful then please help share the news, and leave a rating and review on your podcast app.

I also have a Safe As LinkedIn group if you want to stay up to date on releases.


Shout me a coffee (one-off or monthly recurring)

Spotify: https://open.spotify.com/episode/5BACkpJMDDF8sOGlcze0Cw?si=okp4skJfS9C1dBWPHwaRhg

Apple: https://podcasts.apple.com/us/podcast/ep-12-human-performance-in-barrier-critical-control/id1819811788?i=1000718154331

Safe As LinkedIn group: https://www.linkedin.com/groups/14717868/

Transcript:

The unsung heroes of safety aren’t always in a rulebook. Often, it’s the continuous performance of people, plugging the gaps when our systems fall short. What are some better ways to consider the roles of people within various systems?

Grab a coffee. This is a longer pod. Good day everyone. I’m Ben Hutchinson and this is Safe As, a podcast dedicated to the thrifty analysis of safety, risk and performance research. Visit safetyinsights.org for more research.

Today’s paper is from McLeod, 2017, titled “Human Factors in Barrier Management: Hard Truths and Challenges”, published in Process Safety and Environmental Protection. This paper discusses some hard truths in the assurance of human performance. It’s argued that human performance continues to be relied upon as a control, yet organizations may have miscalibrated ideas about how and when human performance can be relied upon. Normally, human performance is the overwhelming reason for successful outcomes, even in the face of poor systems and resources. Therefore, organizations may struggle to ensure that the human performance they rely on can reasonably be expected to occur when and where it’s needed, and that their controls are as robust as assumed in the face of human performance variability.

First, a common claim following investigations is that, if only people had followed the rules, then the incident wouldn’t have happened. But this assertion relies on some implicit assumptions: that the organization actually has all of the procedures it needs; that the procedures are specific, accurate, clear, up-to-date and valued by people; that people have the knowledge, skills and training to know which procedures to use and when; that they’ll accurately recognize the situations that call for a given procedure; and that they’ll carry it out under the conditions that exist at the time. Yet procedures are almost always written for a perfect world; they assume a vacuum that doesn’t exist in reality.

It’s said that some hard truths about how people see, interpret and respond to the world are hard because they can be difficult and inconvenient to design and manage for. But being hard doesn’t make them less valid or less important to address. Some of the hard truths discussed by the author are: human emotion, thought, performance and attitudes are all highly situated, shaped by the situational context at the time; the design and layout of work systems, equipment interfaces and the environment influence how people sense and respond to the world; people optimize their performance even if it may look riskier in hindsight; and people are not necessarily rational in the classical sense. Think Type 1 and Type 2 thinking, pattern matching and heuristics, or how sensitive people are to loss aversion. This isn’t a bug though; it’s just a feature of humans.

So let’s cover some of the definitions used in this paper. In an earlier episode, I said I’d cover more precise definitions of controls, etc. Well, here we are. Control means the measures that are expected to be in place to prevent incidents. Controls comprise barriers and safeguards. Barriers are controls assessed as being sufficiently robust and reliable that they can be relied on as a primary control measure against incidents. They can be passive or active, and be a combination of human elements, technology and other engineering elements. In contrast, safeguards are controls that support and underpin the availability and performance of barriers, but don’t meet the standards of robustness or reliability needed to be relied on as a full barrier. In other words, barriers sit at the highest tier, whereas safeguards are things that help support barriers, or things that simply don’t meet the same definition because they’re not as reliable.

Another distinction can be made among human barrier elements. Organizational barriers are where the company explicitly prescribes how decisions are to be taken and what is to be done, by means of rules, instructions and procedures. Operational barriers are where there’s no specifically prescribed manner of deciding or acting, and the individual is given quite a bit of discretion to take appropriate action; these rely more on operator skills and capabilities.

Right this second, there are some people furiously typing an angry letter to me about the ICMM’s definition of critical controls. I like that approach too, but remember, the town is big enough for more than one typology. And the approach presented today has decades of support from offshore oil and gas, but whatever floats your boat.

Next, the author covers some of the criteria for robust controls, particularly barriers. Several criteria should be met for a control to be classed as a full barrier: it must be specific to a single potentially hazardous event (specificity); it must be independent of other protection layers (independence); it can be counted on to do what it was designed to do (dependability); and it’s capable of being audited or observed (auditability).
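To make that decision rule concrete, here’s a minimal sketch in Python. The class and field names are my own illustration, not anything from McLeod’s paper; the point is simply that a control which fails any one of the four criteria drops back to safeguard status.

```python
from dataclasses import dataclass


@dataclass
class ControlAssessment:
    """Illustrative record of how a proposed control scores against the four criteria."""
    name: str
    specific: bool      # addresses a single potentially hazardous event
    independent: bool   # independent of other protection layers
    dependable: bool    # can be counted on to do what it was designed to do
    auditable: bool     # capable of being audited or observed

    def classify(self) -> str:
        """Only a control that meets all four criteria counts as a full barrier;
        anything else is treated as a safeguard that supports barriers."""
        if all([self.specific, self.independent, self.dependable, self.auditable]):
            return "barrier"
        return "safeguard"


# Hypothetical example: a double check by a second person often isn't truly
# independent (as discussed below), so it classifies as a safeguard.
double_check = ControlAssessment(
    name="Second-person verification of valve line-up",
    specific=True,
    independent=False,  # same workload, fatigue and incentives act on both people
    dependable=False,
    auditable=True,
)
print(double_check.name, "->", double_check.classify())  # -> safeguard
```

The code itself isn’t the point; the decision rule is. It also hints at the paper’s later observation that the same control can be a barrier in one context and only a safeguard in another.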

Now, there’s a proviso here: these criteria make a lot of sense for engineering systems, but they’re a bit more challenging when we come to human performance. For instance, assuring true independence of human performance is a major challenge. Factors like workload, fatigue and distraction can defeat multiple controls at once. Organizational factors can influence control performance, like incentivizing certain outcomes or contractual arrangements. Independence achieved by having a double check by another person may also not be truly independent, since the second person is affected by the same range of factors. This has been called the fallacy of social redundancy. So there’s a range of these factors, and they’re said to often get overlooked when deciding how to assure human performance.

According to the author, judgments about the likely effectiveness of controls that rely on human performance often aren’t clear about what is actually expected or intended from people. So we often don’t really clarify what we expect of human performance when it’s acting as some sort of control, or how we’d even assess its effectiveness. Things like the design of the work environment and equipment interfaces matter: if a control relies on someone opening or closing a valve, then there needs to be a clear intention that people will know which valve to operate, and how, when and why to operate it. And the valve needs to be designed and labelled in a way that minimizes the chance of people operating it out of sequence.

The author draws on guidance from other human factors organizations, and it’s argued that most organizational measures should be treated as safeguards rather than barriers, because they just don’t meet that higher bar of reliability and effectiveness. Safeguards would include local warnings and signs, the design and implementation of alarms, human-machine interfaces, job design and more. Really, safeguards are there to ensure that the barriers expected to function are not degraded or defeated by other factors. If you’re familiar with bow ties, then things like escalation or degradation factors would be more akin to safeguards. Quoting McLeod, safeguards cannot and do not need to provide the same level of risk reduction as barriers. Nevertheless, safeguards should still have clear ownership, be capable of being audited, and be traceable to some element of the organization’s management system.

Importantly, a control may be a barrier in one situation but be treated as a safeguard elsewhere in the organization, if the company is unable or unwilling to invest the resources to ensure that the barrier functions to the necessary specification. Again, the author draws on guidelines from a human factors organization and draws out eight concerns around human and organizational factors.

First, top events in the bow tie are often situated too far to the right, meaning the events we’re trying to avoid sit too close to the consequences. Too many barriers are identified, many of which don’t meet the accepted criteria for being a barrier; in fact, most controls that rely on human performance meet the definition of safeguards rather than full barriers. Human and organizational factors are rarely incorporated into barrier models, and ideas of cognition and complexity are also rarely incorporated into the performance of barriers. Work-as-imagined versus work-as-done rarely weighs into considerations. Again, we often have a perfect-world view of procedures and controls that work first time, every time.

Human error is frequently identified as a threat in the bow tie, and then barriers are identified to block the error from leading to the top event. Remember, this is a practice the paper says we should be trying to avoid, not an endorsement that we should see human error that way. The implicit expectations of human performance are rarely made explicit, and barrier models are often designed and rolled out to the workforce in a manner that doesn’t properly support their operational use. We do safety to them rather than with them.

One interesting point from the paper is that human error shouldn’t be identified as a threat in bow tie analysis, since it creates a misleading impression that the risk of human error is being adequately managed by barriers. Further, according to the paper, it can promote a focus on minimizing human performance variability over recognizing the real barriers and ensuring they are as robust as they can be. Focusing on human error in a threat line also takes human performance factors out of their context. Critically, treating people as a threat in the bow tie misses the opportunity to develop a deeper understanding of the ways people provide flexibility and adaptability, and therefore contribute to system resilience. Seeing people and their performance variability as threats reinforces a negative view of people as unreliable factors to be managed.

Instead, more focus should be directed towards understanding the performance requirements of the interactions of people and technology and what’s needed to ensure the robustness of barriers and their escalation factors, which I would argue heavily involves learning from normal work and work analysis methods etc.

Next, it’s argued that the following factors should be considered for barriers: the performance the barrier is expected to deliver, specific to the threat and the situation. Who is involved in delivering that performance: who detects the situation, who decides what needs to be done, who takes action? What information is needed for successful performance in the situation? What decisions or judgements are likely to be involved? What actions need to be taken, and how will the operators know whether the actions have been successfully completed? Do they receive feedback during the task? Is there any other technical or non-technical guidance to be followed?

Next, the paper also looks at the standards for successful performance of the barrier, arguing that these should include the maximum allowable time to detect the event that triggers the barrier function, the accuracy of interpreting the event, the maximum allowable time to initiate a response, the acceptable reliability, tolerance limits for acceptable performance, and some other factors.
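As a rough illustration of what such a performance standard could look like if written down explicitly, here’s a sketch in Python. The field names and the example values are my own assumptions for illustration, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class BarrierPerformanceStandard:
    """Illustrative performance specification for a control that relies on people."""
    barrier_name: str
    max_detection_time_s: float          # maximum allowable time to detect the triggering event
    min_interpretation_accuracy: float   # required accuracy of interpreting the event (0-1)
    max_response_initiation_time_s: float  # maximum allowable time to initiate a response
    required_reliability: float          # acceptable probability the action succeeds on demand
    tolerance_note: str                  # tolerance limits for what still counts as acceptable


# Hypothetical example: an operator responding to a high-level alarm.
high_level_response = BarrierPerformanceStandard(
    barrier_name="Operator response to high-level alarm",
    max_detection_time_s=60.0,
    min_interpretation_accuracy=0.99,
    max_response_initiation_time_s=300.0,
    required_reliability=0.9,
    tolerance_note="Manual isolation completed within 10 minutes of the alarm",
)
```

Writing the expectation down like this makes it much easier to spot when an assumption, say 99% interpretation accuracy under real workload and fatigue, is unrealistic and the “barrier” is really a safeguard.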

So we’re right at the end. In all, the paper argues that people are nearly always a positive element in complex socio-technical systems. The objective should therefore be to strive to make people as reliable as possible by setting them up to succeed. Make it as easy as possible to do the right thing and as hard as possible to do the unsafe, hazardous thing.

Organisations operating as complex socio-technical systems should seek to ensure they have in place the necessary systems and support structures and should design and operate their activities in ways that allow people to be as productive and adaptable as they can be.
