June 28, 2023 | Mark Paradies

Safety II and KISS – Keep It Simple Stupid!

Safety II vs. Safety I

One of our TapRooT® Instructors sent me a white paper, written by three research academics, about resilience and humans as a flexible part of the system. I had read about Safety II before and know the academics, but I thought it was time to write a little about what I’ve observed – KISS, or Keep It Simple Stupid.

The academics discovered that many root cause systems look only at failures rather than at both failures and successes. They called their “new” discovery and new ways to improve safety performance “Safety II.” They implied that it is clearly superior to “Safety I.”

The authors lay the blame for accidents in complex systems on poor thinking. They called this poor thinking Safety I. This poor thinking includes poor accident models (their examples: the Domino Theory and Reason’s Swiss Cheese Model) and linear cause-and-effect thinking in root cause tools (they mentioned TRIPOD, AcciMap, and STAMP as being defective).

TapRooT® RCA and Safety I & II

TapRooT® Root Cause Analysis doesn’t fall into the “Safety I” trap that the academics mentioned because we recognized resilience (although we never called it that) and humans as an active part of a successful system. When did we recognize this? Ever since TapRooT® RCA was first developed in the late 1980s. And we have suggested that people use root cause analysis proactively to analyze both success and failure since the early 1990s. Therefore, we may be closer to “Safety II” than to “Safety I,” even though we appreciate what is good about both views of the world. Maybe we should say that we used the best parts of “Safety II” before it was invented.

You might ask, why aren’t we proclaiming the value of resilience and how TapRooT® RCA can handle complex accidents? Why don’t we jump aboard the “Safety II” bandwagon?

First, we don’t want to make root cause analysis seem any more complex than it already is. We built human factors knowledge and resilience theory into the TapRooT® System without trying to make the system seem more complex. We use understandable English and try not to invent too many new terms. TapRooT® RCA has always been built to handle complex systems and accidents, and we think that people understand that. But we also want people to be able to quickly use TapRooT® Root Cause Analysis to analyze simple problems (which is why we built the “low-to-medium risk” investigation process and the 2-Day TapRooT® Course to teach the techniques).

If possible, we prefer simple systems. We hope that engineers design simple, linear systems that aren’t complex and highly connected. We recognized the problems with complex systems as far back as 1984, when Charles Perrow wrote the book Normal Accidents. We incorporated the work of Jens Rasmussen into the Root Cause Tree® Diagram, included a discussion of his decision-making model in our 5-Day TapRooT® Course, and used that model to teach the advantages of simplicity.

After all, you don’t want to have a needlessly complex system that requires human heroics just to make it succeed. And that is where I think “Safety II” goes astray.

Second, we like simple models when they work. They are easy to explain. (They don’t require a Ph.D. to understand them.)

Third, we disagree with some of the precepts of “Safety II.”

For example, the academics think that a needlessly complex system can be run successfully by well-trained, flexible humans. In their view, operators’ heroics can make up for a needlessly complex design and poor planning. They see the “workarounds” invented by operators as successes to be understood, rather than as failures of planning that forced operators to “fly by the seat of their pants” because engineers and management produced an overly complex system without a well-designed procedure and human interface.

We believe that in highly complex systems, we need to apply our human factors skills to simplify and decouple the system to make it more reliable. The opposite is happening in many industries. Healthcare is a good example of the complexity problem.

Should we declare success because research shows that only 1 in 4 hospitalized patients experience harmful events? Should we be trying to learn from the “success” of the 3 out of 4 cases where caregivers struggle to get it right? Or should we be changing the system by learning from the frequent failures?

Instead of simplifying needlessly complex systems and improving reliability, the “Safety II” folks think we need to understand how people (in the healthcare example in their article – doctors, nurses, and others) muddle through and get satisfactory results 75% of the time. Why? Because they think that the problems faced are “intractable” (unsolvable).

What do we think? KISS. We should be simplifying the system, reducing complexity, and applying best practices to improve reliability and thus reduce the “intractability” of the system. We need to make the human’s job straightforward (no heroics required just to get through his or her day).

Simple, linear systems are good. Easy to understand. Creating those systems or modifying complex systems to be simple should be one of our goals.

Results of Safety II

If you accept the theory of “Safety II” that:

  • You can’t really understand and plan work in complex systems.
  • Human variability is not an error but a normal part of the process that helps it succeed.
  • You can’t tell when something is working correctly or not (things are not “bi-modal”).
  • Accidents (adverse outcomes) aren’t solely the result of failures but rather are a combination of failures and normal performance variability.

Then you will find yourself mired in a complex system that only the most learned can comprehend.

But if this whole discussion seems difficult to understand, that is OK. Highly complex systems are difficult to understand. That is why high-reliability organizations try to reduce complexity and the interconnectedness of systems to make outcomes reliable. Keep It Simple Stupid! (KISS)

The solution to system complexity is not the complexity of “Safety II” but rather decoupling, simplifying, and making work understandable (not intractable).

Recommendations

So what do we recommend?

  1. Don’t make things more complex than necessary. SIMPLIFY whenever possible! KISS.
  2. You need to understand failures and successes.
  3. A few problems are due to intractable systems. In those few cases, the systems need to be fixed.
  4. Work isn’t as hard to understand as some might think – especially for those producing the outcomes.
  5. Of course, investigate significant events, but also be proactive! Investigate successes and precursor incidents.

There is considerable overlap between the five ideas listed above and “Safety II” … but there are also significant differences. The main difference is that we don’t have to accept the complexity, intractability, and interconnectedness/coupling of unreliable systems and hope that a human learns to cope with them. We CAN and should change the system!

_ _ _ _ _

This article was written in February 2019 and was slightly revised and republished because it is just as applicable now as it was then.

Categories
Human Performance, Root Cause Analysis Tips

One Reply to “Safety II and KISS – Keep It Simple Stupid!”

  • billy morrison says:

    Very interesting. Reminds me of the UK safety engineer Trevor Kletz. He pointed out (many years ago) that students always learned about Chernobyl, Bhopal, Piper Alpha, etc. In contrast, nobody seemed to talk about plants and other industries where disasters did not happen.
    What are successful industries/plants doing right?
