The Statistical Invalidity of TRIR as a Measure of Safety Performance

Another excellent piece of research from Matthew Hallowell and team. In this report, the researchers statistically analysed company incident data spanning over 3 trillion worker-hours to assess the statistical validity of TRIR/TRIFR and whether it is indicative of high-severity events.

I really can’t do this one justice so just skip my summary and go straight to the white paper (link below); but alas, I’ll take a stab at summarising key findings.

Results:

1. TRIR isn’t statistically associated with fatalities.

The results found no discernible association between fatalities and TRIR, with fatalities and recordable injuries following different patterns and occurring for different reasons. Because of this lack of association, TRIR values "are not a proxy for high-impact incidents" (p12). Further, the authors argue that the safety activities associated with improving TRIR performance may not necessarily help to prevent fatalities.

2. TRIR is almost entirely random.

The results indicated that changes in TRIR "are due to 96-98% random variation" (p12). The authors note that recordables don't occur in predictable patterns, likely because safety is a complex phenomenon impacted by many factors.

Models were tested to see whether historical TRIR predicted future TRIR performance. It was found that at least 100 months of data were needed for reasonable predictive power. Because TRIR is normally used to make monthly or annual comparisons, the authors argue that "this finding indicates that for all practical purposes, TRIR is not predictive".
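To get a feel for how much year-to-year swing pure chance produces, here's a minimal simulation of my own (not from the paper, and the company parameters are hypothetical): a firm whose true underlying rate never changes still reports wildly different annual TRIRs, simply because recordable counts behave roughly like a Poisson process.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

def simulate_annual_trirs(true_trir, annual_hours, years, seed=7):
    """Yearly observed TRIRs for a company whose true rate never changes."""
    rng = random.Random(seed)
    expected = true_trir * annual_hours / 200_000  # expected recordables/year
    return [poisson_sample(expected, rng) * 200_000 / annual_hours
            for _ in range(years)]

# A hypothetical firm: true TRIR fixed at 1.0, ~500 workers (1M hours/year).
trirs = simulate_annual_trirs(true_trir=1.0, annual_hours=1_000_000, years=10)
print([round(t, 1) for t in trirs])  # wide swings despite a constant true rate
```

Nothing about this company's safety changed between years; every movement in the printed numbers is noise, which is exactly why chasing month-to-month TRIR trends is so misleading.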

3. TRIR can’t be represented by a single point estimate.

Because TRIR is almost entirely random, a single number doesn't accurately represent safety performance. If TRIR is used at all, it should be expressed as a range and studied over extended periods of time. The report has examples of what this would look like (e.g. instead of a TRIR of 1, a range of 0.18 to 5.66 is more accurate).

Single point estimates of TRIR, broken down to decimal places over short periods of time (e.g. months to a year), are said to be "statistically meaningless for almost every organization" (p12).
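The paper's 0.18-to-5.66 example presumably comes from its own analysis; as an illustration of the same idea (my sketch, not the paper's method), a standard exact Poisson interval shows how wide the plausible range is around a "TRIR of 1" observed over only 200,000 hours:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), computed as a finite sum."""
    term, total = math.exp(-lam), 0.0
    for i in range(k + 1):
        if i > 0:
            term *= lam / i
        total += term
    return total

def _solve_decreasing(f, lo, hi, tol=1e-10):
    """Bisection root-finder for f decreasing from f(lo) > 0 to f(hi) < 0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def trir_interval(recordables, hours, alpha=0.05):
    """Exact (Clopper-Pearson-style) 95% interval for TRIR per 200k hours."""
    bracket = 20.0 * (recordables + 2)  # generous upper bracket for the mean
    upper = _solve_decreasing(
        lambda l: poisson_cdf(recordables, l) - alpha / 2, 0.0, bracket)
    if recordables == 0:
        lower = 0.0
    else:
        lower = _solve_decreasing(
            lambda l: alpha / 2 - (1.0 - poisson_cdf(recordables - 1, l)),
            0.0, bracket)
    scale = 200_000 / hours
    return lower * scale, upper * scale

# One recordable in 200,000 hours: the point estimate is TRIR = 1.0, but the
# 95% interval spans roughly 0.025 to 5.6 -- a single number hides all of this.
lo, hi = trir_interval(1, 200_000)
```

The exact numbers differ from the paper's example because the method differs, but the conclusion is the same: at typical exposure levels, the honest range around a TRIR point estimate is enormous.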

4. TRIR isn’t precise and shouldn’t be communicated to multiple decimal points.

Highlighting how statistically meaningless most calculations of TRIR are, the analysis revealed that for a company to report a TRIR of 1.00 with a precision of 0.1 (between 0.95 and 1.05 per 200k hours), "about 300 million worker-hours of exposure time is required" (p9), and "unless the TRIR of 1.0 was derived from 300 million worker-hours of exposure time, it should not be reported to even one decimal point" (p9).

Perhaps my favourite part of this paper follows, where it’s shown that reporting TRIR to two decimal places (e.g. 1.29) would require "approximately 30 billion worker-hours of data" (p9).

On this point, the authors state that, “The implication is that the TRIR for almost all companies is virtually meaningless because they do not accumulate enough worker-hours” (p12).
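The back-of-envelope arithmetic behind those figures is worth seeing. If recordable counts are roughly Poisson, the variance of an observed TRIR (per 200,000 hours) is true_trir × 200,000 / hours, so you can solve for the hours needed to pin the estimate down to a given margin. A sketch under that Poisson assumption (my derivation, not code from the paper) reproduces both of the paper's numbers:

```python
def hours_for_trir_precision(true_trir, half_width, z=1.96):
    """Worker-hours needed so the 95% margin of error on an observed TRIR
    (recordables per 200,000 hours, Poisson count model) is <= half_width.

    Var(TRIR_hat) = true_trir * 200_000 / hours, so solve
    z * sqrt(true_trir * 200_000 / hours) = half_width for hours."""
    return 200_000 * true_trir * (z / half_width) ** 2

one_decimal = hours_for_trir_precision(1.0, 0.05)    # ≈ 3.07e8 hours
two_decimals = hours_for_trir_precision(1.0, 0.005)  # ≈ 3.07e10 hours
```

A half-width of 0.05 (one decimal place of precision) demands about 300 million worker-hours, and a half-width of 0.005 (two decimal places) about 30 billion, matching the quoted figures.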

5. If TRIR is used for performance evaluations, it’s likely rewarding random variation.

On this point, the authors say that TRIR shouldn’t be used to track internal performance or to compare companies, business units, projects or teams, because the average company would require tens of millions of worker-hours to produce a statistically valid metric.

6. TRIR is predictive only over very long periods of time.

As highlighted earlier, TRIR is only predictive when over 100 months of TRIR data are accumulated.

Honestly, my summary of this study sucked. Just read the original report and you won’t regret it.

Link: http://matt.colorado.edu/papers/StatisticalInvalidityOfTRIR.pdf
