This study systematically reviewed the evidence for the efficacy of two-person checks for safety-critical tasks in high-consequence industries outside of healthcare.
Just 9 studies met the inclusion and quality criteria.
Providing background:
· “Arm doors and cross-check” is said to be a familiar statement when travelling by air. It draws on the principle of redundancy and is intended “to reduce the likelihood of system failures and increase reliability by requiring that a number of people or processes independently perform the same function” (p1)
· Effective checking requires “the allocation of attentional resources to compare the features of the system to the features expected at the particular point in time” but because this activity is cognitively demanding it’s possible that “the process becomes routine [and] attentional resources may be withdrawn, leading to checking errors. That is, the completion of routine tasks can become ritualised, so they are not performed with the degree of attention necessary to identify deviations” (p2)
· Despite the hypothetical merits of double-checking, the research base on its efficacy, and on the factors that support or hinder the process, is limited. A prior systematic review of the effectiveness of double-checking to reduce medication administration error found inadequate evidence to support the process
· One study of 1523 inpatients found that independent double-checking was rarely performed as expected [** what we may call work-as-imagined…] – although “‘primed double-checking’, where one nurse shares information or primes a second nurse with information during the checking process, was highly prevalent” (p2)
· Despite the prevalence of primed double-checking (but not the mandated double-checking process), they found no significant association between primed or mandated double-checking and the rate or severity of medication admin error
Results
Overall, they conclude that: (quotes from page 5 or 6)
· 9 studies from aviation, chemical manufacturing and psychology “do not support the adage that two people are better than one” and that “the performance of two people checking was not superior to one person in detecting errors”
· Redundancy in high-consequence industries may be advantageous for technical systems but “it is less clear whether this is similar for human redundancy in healthcare, aviation, and chemical engineering”
· It’s difficult to determine the effectiveness of double-checking based on the current evidence since it is both small and of mixed quality
· As another limitation of the current research, many studies used uni students [** although one aviation study found that there was little difference between student and pilot samples for error rates and overall response patterns across automation failures, perhaps suggesting that, at least in that type of activity, uni student results may still be applicable to some degree]
They discuss findings from the 9 individual studies that met inclusion – I’ll just cover a few examples.
One aviation study of double-checking using a simulator found no significant difference in performance between single-pilot and dual-pilot operations regarding automation failures. Another study, using uni students, examined automation bias in a computer simulation of tasks involved in commercial flying. Individuals and pairs were equally likely to miss events and fail to respond to system irregularities.
Three studies investigated redundancy and automation monitoring in chemical plants – again, all three used uni students. In one, participants who worked in parallel with a second person (the redundant group) cross-checked the automation significantly **less** frequently than those working alone, a finding supported by the other two studies.
That is, people working in groups designed to enhance redundancy and double-checking actually ended up checking less frequently than people working in isolation. [** a related theme here is the “fallacy of social redundancy”, as discussed by others like Sagan, Snook, Dekker].
Four psychology studies met inclusion. One study explored performance of people when working alone or in a “redundant system to identify errors in university transcripts”. Interestingly, “participants worked faster but made errors more frequently when in a perceived reliable, redundant system compared to a condition where they worked in isolation or in a perceived unreliable, redundant system” (p5).
They note that “the performance of two people checking was not superior to one person in detecting errors”, but this conclusion rests on the current evidence, which is relatively small in number of studies, limited in statistical power (participants) and/or of mixed quality and findings.
It’s also difficult to conclude whether double-checking is more effective for error detection. Moreover, few studies considered individual factors for the interaction with technology, like self-confidence, workload and cognitive load – all found in other work to influence human performance with automated systems. Such factors could influence double-checking performance but were not evaluated.
However, one exception was that three studies of automation monitoring in chemical manufacturing “did find that double-checking was undermined by social loafing … That is, participants exerted less effort when they were aware that they were working with others who they expected would take greater responsibility during the task” (p5).
Finally, they explored the question of whether the type of double-checking matters. There are two types of two-person checks:
1) Independent double-checking, which is often mandated when nurses administer certain types of medication. Here each nurse separately checks medication details.
2) Collaborative double checking, used more often in aviation, is where one checker calls out information which is then confirmed by a second checker.
They speculate that perhaps collaborative double-checking is more effective for some safety-critical tasks whereas independent double-checking may be more effective for other activities. This is an understudied area in the research.
Another limitation is evaluating and controlling for the fidelity of checking processes in real-world settings. One study in healthcare notes that when nurses undertook double-checks during medication administration, it was almost always the collaborative style between two nurses rather than the hospital-mandated policy of independent double-checks.
An aviation study from NASA that explored why flight checklists sometimes fail to catch errors and equipment malfunctions found 585 failures of the process in practice. These included 43 instances where a pilot confirmed a checklist prompt without visually inspecting the item, 42 instances where checklist items were omitted or incorrectly called out, and 113 instances where a required verification wasn’t obtained. Thus, “these studies indicate that checking processes are not always completed as recommended” (p5).
However, they also clarify that evidence based on real-world settings is thin, and the evidence generally hasn’t thoroughly evaluated checking process fidelity or environmental factors, so it’s unclear whether double-checking itself provides no safety benefits or whether the lack of observed benefit is more related to other factors.
They discuss the types of checking tasks and how this is relevant for studying and evaluating double-checking, as different checking activities for different tasks will draw on different cognitive resources.
They state that, for these purposes, double-checking tasks can be classified as:
· Mechanistic – such as comparing a prescription against an IV bag’s label
· Abstract – drawing on knowledge to realise a medication needs to be diluted in a different solution
· Monitoring – monitoring a system that autonomously controls chemical processes
The differences between the mechanisms behind each task are important. For example, declarative memory retrieval is “more negatively affected by stress than procedural memory retrieval” and therefore “stress may threaten abstract tasks more than mechanistic tasks”.
Finally, one study provided an estimate of the cost of double-checking in healthcare. By their estimate, the cost of a second nurse to double-check medication at one 340-bed hospital was >$7,000 per day or $2.7M annually. Applied to an Australian hospital with over 740 beds, this cost is ~$5M per day nationwide.
Therefore, having good evidence about whether double-checking is having the intended benefits is critical, since this diversion of money could be directed elsewhere.
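As a rough sanity check on the cost figures above, the daily and annual estimates for the 340-bed hospital can be compared with a few lines of Python. This is only an illustrative sketch: the function names and the 365-day year are my assumptions, not the study's method, and the per-bed figure is a simple division rather than something the authors report.

```python
# Hypothetical helpers for sanity-checking the reported cost figures.
# Assumption: costs accrue evenly across a 365-day year.
DAYS_PER_YEAR = 365


def annual_from_daily(daily_cost: float) -> float:
    """Scale a per-day cost up to a per-year cost."""
    return daily_cost * DAYS_PER_YEAR


def daily_from_annual(annual_cost: float) -> float:
    """Scale a per-year cost down to a per-day cost."""
    return annual_cost / DAYS_PER_YEAR


def cost_per_bed_day(daily_cost: float, beds: int) -> float:
    """Spread a hospital's daily cost evenly across its beds."""
    return daily_cost / beds


# Figures quoted in the summary for the single 340-bed hospital.
reported_annual = 2_700_000
reported_beds = 340

# The daily cost implied by the annual figure; it should exceed $7,000.
implied_daily = daily_from_annual(reported_annual)
per_bed = cost_per_bed_day(implied_daily, reported_beds)

print(f"Implied daily cost: ${implied_daily:,.0f}")
print(f"Implied cost per bed per day: ${per_bed:,.2f}")
```

The annual figure implies a daily cost a little above $7,000, which is consistent with the ">$7,000 per day" wording.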
Authors: McMullan, R. D., Urwin, R., Wiggins, M., & Westbrook, J. I. (2023). Are two-person checks more effective than one-person checks for safety critical tasks in high-consequence industries outside of healthcare? A systematic review. Applied Ergonomics, 106, 103906.
Link to the LinkedIn post: https://www.linkedin.com/pulse/two-person-checks-more-effective-than-one-person-tasks-ben-hutchinson