The Curse of Apparent Cause Analysis

This Root Cause Network(TM) Newsletter article has received attention from many. Some have called me to praise the idea of more attention to serious incidents and proactive improvement. Others have questioned my views about NOT analyzing near-misses with “apparent cause analysis.”

Here is the article. Please add your comments by clicking on the comments link below.

The Curse of Apparent Cause Analysis

If you are not in the Nuclear Industry, you probably haven’t heard of Apparent Cause Analysis. If you are in the Nuclear Industry, you should wish that you never had. Why do I call Apparent Cause Analysis a curse while others say that it is a good practice? Read on to fully understand.

In almost every industry, the question most frequently asked by people who are doing good root cause analysis is:

Isn’t there an easier way to do this for smaller incidents?

That question led the Nuclear Industry to Apparent Cause Analysis. The idea is that you can skip some of the rigor of real root cause analysis, ask fewer questions, assume some facts, and even occasionally guess at the most likely cause, and the result will still be good enough to develop effective corrective actions and to trend. Now for the bad news:

You get what you pay for.

When you take shortcuts, skimp on the facts, don’t ask all the questions, and guess, the result is a poor analysis that should NOT be used to develop corrective actions or trending. Why? Because you will trend and correct the investigator’s assumptions and guesses. They may be good assumptions and guesses, but they may also be bad.

I tell people if saving time is their highest priority, why not use an even faster technique … Spin-a-Cause™!

But some say,

“Implementing improvement based on our engineers’ best guess is better than doing nothing at all.”

My answer? NO IT ISN’T!

Look at the iceberg model and think. A nuclear plant does 1 to 5 good root cause analyses per year. But they do 100s or even 1000s of short-cut analyses. The 1000s of corrective actions based on guesses & assumptions are driving their improvements. Is that why nuclear managers complain that improvement programs aren’t cost effective?

So what should companies do? Here’s my advice:

1. Expand good root cause analysis to all incidents. (That’s about 100 incidents per year.) It is cost effective.

2. Learn to be efficient in your root cause analysis efforts without taking shortcuts. (Article on this is at the Root Cause Analysis Blog Site.)

3. Stop doing analysis on near misses and instead, categorize the occurrence types and watch for adverse trends.

4. Take the effort you save from not doing 1000 short-cut analyses and not fixing the assumed problems and put that effort into a targeted PROACTIVE improvement program based on good root cause analysis.

Need to learn more? Attend the 5-Day TapRooT® Course or talk to me at the TapRooT® Summit. But stop fooling yourself about getting something for nothing!

8 Responses to “The Curse of Apparent Cause Analysis”

  1. John Says:

    John Sargaison from Santos sent the following comment:

    Mark

    I would like to challenge your third recommendation re: stop analysing near misses.

    I should start by saying there are near misses and then there are NEAR MISSES.

    There are two broad categories that you can split incidents/near misses into - high frequency, low consequence events and low frequency, high consequence events. Of course with near misses you are working on the likely potential consequence rather than an actual consequence. The latter category (low frequency, high consequence incidents and near misses) is typically rare in comparison to high frequency, low consequence events (incidents and near misses).

    A low frequency high consequence near miss should receive the same time and effort that would be afforded to an actual significant incident as they will identify root causes which are likely to be different than those that will be found in actual high frequency, low consequence incidents.

    Relying on fixes to root causes derived from investigations of high frequency, low consequence incidents to prevent low frequency high consequence incidents is folly. True insight into system weaknesses that could lead to high consequence events such as loss of process containment in an oil and gas facility will primarily be provided by thorough analysis of the relatively rare low frequency, high consequence near misses and incidents.

    I would rewrite point three along the lines:

    Stop doing analysis of low potential consequence near misses…

    Food for thought.

  2. Mark Paradies Says:

    I agree completely.

    But the potentially high consequence near-misses should be called incidents and therefore should be included in the investigations that are already performed. (Top part of the iceberg.)

    I am going to post the newsletter link on the blog site. Can I post your comment?

    Mark

  3. John Sargaison Says:

    Mark,

    Yes please post my comment – discussion of these issues is always beneficial. Semantics I know, but a near miss by definition is an event that did not result in injury or damage, but which, when assessed, had the potential to have done so. A high consequence event that did not result in injury or damage is by definition still a near miss.

    Thanks for taking the time in preparing the articles that you do. I always enjoy reading them and the subsequent discussions that typically flow. The issue of TapRooT(R) being too resource intensive for low consequence incidents is by far the issue that causes me the most discussion with management.

    Regards
    John Sargaison
    Chief Health and Safety Adviser
    Corporate EHS&S

  4. Mike Hassell Says:

    Mark,

    I think I understand your disdain for apparent cause analysis, but I believe it may be a little misfocused. Our differing viewpoints may be founded in the purpose behind performing apparent cause analysis to start with.

    The purpose of performing apparent cause analysis should not be to provide a simple shortcut method to identify corrective actions, but rather to provide a reasonable basis for applying failure mode coding to high frequency, low consequence events. These codes in turn support the ability to have meaningful data for a trend analysis program.

    The problem with apparent cause analyses that you may be reacting to is the willingness of an organization to make changes to their programs, organizations, or processes based on what you call a short cut approach.

    Correcting/changing a program, organization, or process should be an action taken only when the level of certainty is high that you are addressing the “cause” of a problem and that you are not actually creating the cause of the next problem. A well performed and substantiated Root Cause analysis usually brings the organization to that appropriate level of confidence necessary to support the changes to remove error likely conditions or systemic weaknesses that resulted in the event - however, an apparent cause analysis rarely provides that level of confidence. As such, there should be a reluctance to take “corrective” actions beyond the remedial actions needed to address the adverse condition based on a single event related apparent cause analysis. To do otherwise results in the hundreds of corrective actions that really don’t provide any improvement in performance - and most likely cause a decrease in performance.

    In summary, I believe the warning associated with whether or not to perform an apparent cause analysis is more appropriately contingent on whether or not the performing organization has the restraint to not “fix” every problem identified by an apparent cause analysis. Lacking that restraint, I would agree that embracing the apparent cause analysis approach would, in itself, be an error likely strategy.

  5. Mark Paradies Says:

    Thanks for your comments.

    I agree with you - mostly … and I think you agree with me, mostly …

    First, your approach to Apparent Cause Analysis sounds different from most I’ve observed. If the data you produce is meaningful and is not root cause related, then you do have a different approach than what I’ve seen others take. It sounds more like what I’m proposing for incidents not worthy of analysis.

    The results I’ve seen from nuclear utilities’ approach to apparent cause analysis sound like your description of an “error likely strategy.” Maybe I’ve just seen bad examples?

    From those bad examples, I’m not convinced that most apparent cause analysis as practiced in the nuclear industry produces statistics worthy of trending.

    Apparent Cause Analysis Trending would probably be better than trending random numbers, but how much better? I don’t think I would have confidence as a manager or a regulator to base decisions about improvement on the Apparent Cause Analysis data produced by the methods I’ve seen or heard about.

    Therefore, as I see the results, I’m not convinced that the effort spent is worthwhile.

    I would guess that a good Apparent Cause Analysis (I cringe at the term “good Apparent Cause Analysis”) and Root Cause Analysis aren’t that far apart in effort. So why not do an investigation worthy of some management confidence? Why not spend the extra effort to identify real root causes?

    I’m not sure why I’m so averse to Apparent Cause Analysis except that it seems such a shame to fool yourself that short-cuts can lead to quality results. And the way I’ve seen Apparent Cause Analysis used fits the description of a short-cut, and I haven’t seen it produce reliable data.

  6. Wayne Pennycook Says:

    Hi Mark,
    Interesting discussion…and as an advocate of more proactive reporting of hazards and near misses…one that I’ve been observing for years, and still haven’t been successful at getting full value from an increased level of reporting.

    The value I’m striving for is to make improvements to the underlying systems and processes that will eliminate the repeating patterns of behavior. The barrier to getting value from this level of reporting typically is that we tend to apply the same analysis process for individual near misses/hazards with low potential as we would for an incident of consequence, which tends to bog down in process when you have 1-200 events/month (I know one company that reports 1000 behavior observations/month - essentially near misses, if you support the theory that the essential difference between a hazard and near miss is human interaction….and I realize that isn’t always the case (Directionally accurate and precisely wrong)).

    I think the opportunity is to ‘categorize’ the individual event data, look at the common underlying systems/processes and apply root cause logic to that level of thinking.

    One example I think about: the monthly near miss data pointed to a poor quality morning tail-gate/task hazard analysis (THA) process….and the quality apparently deteriorated because the work crew were continuing with work in a plant area over the span of a number of days. What they were missing in their morning THAs was an explicit conversation about changes in the plant operating conditions or the working conditions (weather, a different stage of the task which led to different hazards associated with that day’s task). When we addressed this gap explicitly, there was a reduction in events that I could relate to this process. However, without continued reminders about the importance of this fundamental process, behavior tended to revert to the complacent approach previously observed. So, the “jury is still out” regarding whether this ‘proactive reporting’ is going to make a difference.

    I’m reminded of my research into the “Human Factor” revealing that we are pretty awful when it comes to ‘vigilance’ when the work is routine. Some successful companies say that Safety is as much about thinking as it is about anything else. So, the question seems to be about how we keep people thinking, engaged and aware.

    I hope the thoughts are helpful.

  7. Mark Says:

    Thanks for your comment.

    Yes - I agree with you and Mike about categorizing and trending (I need to post my talk about this sometime soon).

    I just think that people try to do more with “Apparent ‘CAUSE’ Analysis.” Things like corrective actions and cause determination for trending.

    I think you should trend:

    - incident location
    - incident time
    - workgroup
    - equipment involved
    - incident type (for example - slip on wet surface or part failure)

    That’s a small list - and I invite others to add to it. But it is not an Apparent Cause Analysis or trending of Apparent Causes.
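    As a rough illustration of categorizing occurrences and watching for adverse trends without assigning any cause codes, here is a minimal Python sketch. Everything in it is an assumption made for the example and not part of any TapRooT® tool: the record layout, the sample data, the `adverse_trends` helper, and the simple rule of flagging a category when this month's count exceeds twice its average over earlier months.

```python
from collections import Counter

# Hypothetical occurrence log. Each record holds only categories
# (when, where, what type); no cause is guessed or recorded.
OCCURRENCES = [
    {"month": "2006-01", "location": "Unit 1", "type": "slip on wet surface"},
    {"month": "2006-01", "location": "Unit 2", "type": "part failure"},
    {"month": "2006-02", "location": "Unit 1", "type": "part failure"},
    {"month": "2006-02", "location": "Unit 2", "type": "slip on wet surface"},
    {"month": "2006-03", "location": "Unit 1", "type": "slip on wet surface"},
    {"month": "2006-03", "location": "Unit 1", "type": "slip on wet surface"},
    {"month": "2006-03", "location": "Unit 2", "type": "slip on wet surface"},
    {"month": "2006-03", "location": "Unit 2", "type": "part failure"},
]


def monthly_counts(records, field):
    """Count occurrences per (month, category value) for one field."""
    return Counter((r["month"], r[field]) for r in records)


def adverse_trends(records, field, current_month, factor=2.0):
    """Flag category values whose count in current_month exceeds
    `factor` times their average monthly count in earlier months.

    The threshold rule is an illustrative assumption, not a
    statistically rigorous trending method.
    """
    counts = monthly_counts(records, field)
    # "YYYY-MM" strings sort chronologically, so plain comparison works.
    earlier = sorted({r["month"] for r in records if r["month"] < current_month})
    flagged = {}
    for value in sorted({r[field] for r in records}):
        current = counts[(current_month, value)]
        if earlier:
            baseline = sum(counts[(m, value)] for m in earlier) / len(earlier)
        else:
            baseline = 0.0
        if current > factor * baseline:
            flagged[value] = (current, baseline)
    return flagged


if __name__ == "__main__":
    for field in ("type", "location"):
        print(field, adverse_trends(OCCURRENCES, field, "2006-03"))
```

    With this sample data, only the "slip on wet surface" type is flagged for March; neither location is. A flagged category is a prompt to go and do a real root cause analysis on that pattern, not a cause in itself.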

    I think even the term “Apparent Cause” is deceiving. Why don’t we call it … “Jump to Conclusion” Analysis? This would more properly describe what is most commonly practiced by junior engineers overwhelmed with other work.

  8. Joe Anastasio Says:

    At my plant I have had to write an “apparent cause analysis” for an apparent cause analysis that did not prevent a repeat event. I certainly wish that I had found this discussion prior to that event!
