July 1, 2020 | Mark Paradies

Best Root Cause Analysis for Equipment Problems [Review]

troubleshooting an equipment problem

What is the Best Root Cause Analysis for Equipment Problems?

Before you can review a root cause analysis system to use for equipment problems, you need to decide what makes the best root cause analysis system to meet your goals.

In this article we will discuss:

  • goals for an equipment root cause analysis system
  • what makes a root cause analysis system work for equipment failure investigations
  • recommended root cause analysis systems (and what doesn’t work)

This will help you plan to implement the best root cause analysis to improve your company’s equipment reliability.

What Do You Need for Root Cause Analysis of Equipment Problems?

troubleshooting a gearbox failure

To set your goals for improving your equipment problem root cause analysis, you must understand why people have problems finding the root causes of equipment problems. In this article, we will cover the two major issues we have seen.

FIRST, we see people trying to find root causes BEFORE they understand the problem. In other words, they need to complete effective equipment failure troubleshooting BEFORE they start their root cause analysis.

If you don’t understand the failure – what happened to cause the equipment failure – you can’t understand why the failure occurred (the root causes).

We saw this problem back in the late 1990s when we tried teaching root cause analysis without teaching effective, advanced troubleshooting techniques (more about that later).

SECOND, many times we see people blame the failure on the equipment when there really was a human performance cause. Why do they do this? Because they don’t want one of their coworkers blamed for the failure.

A majority of equipment failures are caused by:

  • improper installation
  • failure to maintain the equipment
  • improper maintenance
  • using the equipment for something it wasn’t designed to do
  • improper operation of the equipment

These are all human performance problems.

Here’s the catch … almost all the people investigating equipment failures have ZERO training on finding the root causes of human performance issues (human error). Therefore, they jump to the conclusion that someone must be to blame for the error. Then the common, ineffective corrective action is to:

  • Counsel the mechanic/operator to be more careful, or
  • Retrain the mechanic/operator, or
  • Write a procedure for the mechanic/operator (or make the procedure longer).

Do you frequently see these corrective actions but the equipment failures continue to repeat?

Therefore, there are two goals you must have to improve your equipment failure root cause analysis:

  1. You need effective TROUBLESHOOTING.
  2. You need a root cause analysis technique that helps people find the ROOT CAUSES of HUMAN ERRORS.

That’s what we will explain in the next two sections.

Improve Your Troubleshooting

Back in the mid-1990s, we had people tell us that advanced root cause analysis didn’t work for equipment problems. We discovered that these folks couldn’t find the root causes of the failure because they didn’t understand how the failure happened. They wanted to jump to root cause analysis BEFORE they understood what happened. But jumping to conclusions didn’t help them effectively solve the problem.

So what did we do? We contacted one of the world’s most knowledgeable experts in equipment reliability and troubleshooting – Heinz Bloch.

Heinz developed his knowledge of equipment failures and troubleshooting while working at Exxon. He became their “go-to” guy for solving “unsolvable” problems at facilities around the world. When he retired, he wrote over 30 books on equipment reliability and equipment troubleshooting topics.

It was obvious to us that we didn’t want to reinvent the wheel. Instead, we worked with Heinz and licensed his advanced equipment troubleshooting tools so we could include them in our root cause analysis system (we will talk about that later).

What did this do? It gave people in the field a handy checklist to use to effectively troubleshoot equipment failures based on the symptoms. This made Heinz’s knowledge available to all field personnel and provided much better ideas for the “what happened” portion of understanding an equipment failure.

What if Heinz didn’t have a particular type of equipment in his troubleshooting tables? We worked to provide two other options to make the troubleshooting work.

First, we incorporated two of Heinz’s advanced techniques into our system: Failure Modes Analysis and Failure Agents Analysis. These can help troubleshoot problems that aren’t covered in the troubleshooting tables.

Second, the software allows you to add troubleshooting tables for your custom equipment. You use the knowledge of your experts, consultants, or vendors to develop custom troubleshooting tables and add them to the software. Then your folks can use them whenever a problem occurs. In this way, the knowledge of your experts isn’t lost when they retire.
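
As a rough illustration of what a custom, symptom-based troubleshooting table could look like, here is a minimal Python sketch. The equipment type, symptoms, and possible causes are made-up placeholders, and the names TroubleshootingTable, add_entry, and possible_causes are hypothetical. This is not the TapRooT® Software's data format, and it does not reproduce any content from Heinz's tables.

```python
from collections import defaultdict


class TroubleshootingTable:
    """Maps an observed symptom to the possible causes an expert has listed."""

    def __init__(self, equipment_type):
        self.equipment_type = equipment_type
        self._causes_by_symptom = defaultdict(list)

    def add_entry(self, symptom, cause):
        """Record one possible cause for a symptom, ignoring duplicates."""
        key = symptom.strip().lower()
        if cause not in self._causes_by_symptom[key]:
            self._causes_by_symptom[key].append(cause)

    def possible_causes(self, symptom):
        """Return the causes to check for a symptom, in the order they were added."""
        key = symptom.strip().lower()
        return list(self._causes_by_symptom.get(key, []))


# Hypothetical entries written by an in-house expert (illustrative only).
table = TroubleshootingTable("custom mixer")
table.add_entry("Excessive vibration", "loose mounting bolts")
table.add_entry("Excessive vibration", "worn bearing")
table.add_entry("Motor overheats", "blocked cooling vents")

for cause in table.possible_causes("excessive vibration"):
    print("Check:", cause)
```

Keying the table on the symptom mirrors how a troubleshooter works in the field: they start from what they observe, not from the cause they suspect.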

We will show you more information about these troubleshooting tools below.

Improve Your Root Cause Analysis

Now that you have effective troubleshooting, you are ready to find the equipment problem’s root causes and fix them with effective corrective actions.

What do you need to find the root causes of human performance issues? You need guidance for your investigators that will help them dig deeper into human errors and find their root causes. This means an expert system that helps investigators find causes that they previously would have overlooked.

Where can you find a root cause analysis system with built-in human factors troubleshooting tools? Keep reading to find out.

The Root Cause Analysis System with the Tools You Need

OK – I’m going to share the secret … the TapRooT® Root Cause Analysis System is the advanced root cause analysis tool that has both Heinz Bloch’s advanced equipment troubleshooting tools and a built-in expert system to guide investigators to the fixable root causes of human errors. We will provide an overview of these tools below.

Equifactor® Equipment Troubleshooting and TapRooT® Root Cause Analysis Book

FIRST, Heinz Bloch’s troubleshooting tools cover the equipment troubleshooting guidance that people need. The Equifactor® Equipment Troubleshooting Tools are described in a book, Using Equifactor® Troubleshooting Tools and TapRooT® Root Cause Analysis to Improve Equipment Reliability. They are included in the TapRooT® Software. And they are taught in two Equifactor® Courses. More about each of these below.

The TapRooT® Process for Equipment Troubleshooting and Root Cause Analysis

The diagram above details the process and tools for troubleshooting an equipment failure, identifying the failure’s Causal Factors, finding the Causal Factors’ root causes, and developing effective corrective actions.

SnapCharT®

Once you have a preliminary SnapCharT® (example above), you are ready to start troubleshooting using the Equifactor® Troubleshooting Tables and Tools shown in the guide below…

Equifactor® Troubleshooting Tables

You can look at the Equifactor® Troubleshooting Guide above and get an idea of the types of equipment that are included in the troubleshooting tables.

The information gained during troubleshooting is added to the SnapCharT® to provide a complete understanding of what happened, which leads to the identification of the equipment failure’s Causal Factors.

Some equipment failures are serious enough that you will want to perform a root cause analysis. Some are trivial, and a simple replacement of the part or equipment is all that is needed. That is why there is a step in the process to decide if you will choose to do a complete root cause analysis or stop after troubleshooting and just replace the part (or equipment).

The decision is based on your judgment. You need to decide if there is something of value to learn by expending the effort needed to find and fix the equipment failure’s root causes.

If you proceed to identify the equipment failure’s Causal Factors, you will use the Equipment Failure Causal Factor Worksheet (shown below).

Causal Factor Worksheet

Copyright © 2019 by System Improvements. Duplication Prohibited.

By this point in the process, you have developed an “Incident” (the circle on your SnapCharT® Diagram). The Incident is the worst thing that happened during the sequence of events. The Incident is usually the consequence of the equipment failure. For example:

  • Loss of production
  • More extensive equipment damage
  • Environmental release
  • An explosion or fire
  • An injury or fatality

Thus, Causal Factors may include actions that go beyond simple equipment failure and usually will include issues related to human errors. That is why TapRooT® Root Cause Analysis goes beyond the Equifactor® Troubleshooting System and includes the essential TapRooT® System Techniques that help the investigator analyze human errors.

If you are familiar with using the TapRooT® Root Cause Analysis System and the Root Cause Tree® diagram for finding the root causes of safety and quality issues, you know that the next step in the TapRooT® Process is finding the specific root causes for each Causal Factor. For this, you use the TapRooT® Root Cause Tree® and Root Cause Tree® Dictionary.

The Root Cause Tree® is a systematic analysis tool that guides you to the root causes of human performance and equipment-related problems. If the Causal Factor is related to a human performance difficulty (a human error), you will use the Human Performance Troubleshooting Guide to find areas to explore in more detail. A piece of the guide is shown below…

Human Performance Troubleshooting Guide

The Human Performance Troubleshooting Guide (15 questions) leads you to one or more of the seven Basic Cause Categories. The seven Basic Cause Categories are:

  • Procedures,
  • Training,
  • Quality Control,
  • Communications,
  • Human Engineering,
  • Work Direction, and
  • Management Systems.

If a category is indicated by the guide, the investigator uses evidence in a process of elimination and selection guided by the questions in the Root Cause Tree® Dictionary.

The investigator uses evidence to work their way down the tree until root causes are discovered under the indicated categories or until that category is eliminated. Here’s the Human Engineering Basic Cause Category with one root cause (Lights NI) indicated.

Each Causal Factor can have one or more root causes.
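
One way to picture the bookkeeping described above is a record per Causal Factor that tracks which Basic Cause Categories the guide indicated, which were eliminated by evidence, and which root causes were selected. The Python sketch below is only an illustration. The class and method names are hypothetical, the example Causal Factor is made up, and nothing here reproduces the Root Cause Tree® or its Dictionary. Only the seven category names and the "Lights NI" root cause come from this article.

```python
from dataclasses import dataclass, field

# The seven Basic Cause Categories named in this article.
BASIC_CAUSE_CATEGORIES = (
    "Procedures", "Training", "Quality Control", "Communications",
    "Human Engineering", "Work Direction", "Management Systems",
)


@dataclass
class CausalFactor:
    description: str
    indicated: set = field(default_factory=set)      # categories the guide pointed to
    eliminated: set = field(default_factory=set)     # categories ruled out by evidence
    root_causes: dict = field(default_factory=dict)  # category -> selected root causes

    def indicate(self, category):
        """Mark a category as indicated by the troubleshooting guide."""
        if category not in BASIC_CAUSE_CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.indicated.add(category)

    def select(self, category, root_cause):
        """Evidence supports this root cause under an indicated category."""
        if category not in self.indicated:
            raise ValueError(f"category was not indicated: {category}")
        self.root_causes.setdefault(category, []).append(root_cause)

    def eliminate(self, category):
        """Evidence did not support any root cause under this category."""
        if category not in self.indicated:
            raise ValueError(f"category was not indicated: {category}")
        self.eliminated.add(category)

    def open_categories(self):
        """Indicated categories that still need to be worked through."""
        return self.indicated - self.eliminated - set(self.root_causes)


# Made-up example of one Causal Factor being analyzed.
cf = CausalFactor("Mechanic misread the gauge")
cf.indicate("Human Engineering")
cf.indicate("Training")
cf.select("Human Engineering", "Lights NI")
cf.eliminate("Training")

print(cf.root_causes)      # {'Human Engineering': ['Lights NI']}
print(cf.open_categories())  # set()
```

An empty result from open_categories() is the sketch's way of saying every indicated category has either produced a root cause or been eliminated.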

The process of using the Root Cause Tree® was tested by users in many different industries including a refinery, an oil exploration division of a major oil company, the Nuclear Regulatory Commission, hospitals, and an airline. In each case, the tests proved that the Root Cause Tree® helped investigators find root causes that they previously would have overlooked and improved the company’s development of more effective corrective actions. You can see examples of performance improvement achieved with the TapRooT® System under the Solutions – By Industries tabs above (each industry has a different success story).

Finally, you need to develop corrective actions for the root causes you have identified. You do this using the Corrective Action Helper® Guide or Corrective Action Helper® Module of the TapRooT® Software.

Typical guidance provided in the Corrective Action Helper® Guide is shown below for one root cause.

You may need to address the root causes of the failure and implement the fixes before you repair the equipment. Why? Because you don’t want to cause a repeat failure by damaging the equipment while you repair it.

This may seem like a detailed, complex process but remember … it is guided by the software and it is really quite simple once you have been through training. Also, remember … you decided that the effort was worth it early in the investigation. In many cases, the troubleshooting will be all you do. Full root cause analysis is performed only if you decide that the effort and corrective actions to prevent recurrence will be worthwhile.

And that’s the point. You only expend the effort you need depending on the value the effort will produce.

What’s Next?

If your company is new to TapRooT® Root Cause Analysis and you are considering it to help your company improve equipment reliability, we recommend that you send a maintenance manager, a reliability engineer, or one of your top equipment troubleshooters to one of our public 2-Day Equifactor® Equipment Troubleshooting & TapRooT® Root Cause Analysis Courses.

Don’t worry. They will find this training valuable. How do we know? Because we guarantee our training. Here is our Equifactor® Training guarantee…

Attend this course, go back to work, and use what you have learned to analyze accidents, incidents, near-misses, equipment failures, operating issues, or quality problems. If you don’t find root causes that you previously would have overlooked and if you and your management don’t agree that the corrective actions that you recommend are much more effective, just return your course materials and we will refund the entire course fee.

Because we know that you will find the Equifactor® Training valuable, we know that you will want to hold more training for your staff at your site. The only decision you will need to make is whether to hold the full 2-Day Equifactor® Course for everyone or, to save time, just the 1-Day Equifactor® Equipment Troubleshooting Training for your field personnel.

The skills that people learn will be directly applicable to the troubleshooting and root cause analysis they perform in the field.

What else? You will want to implement the TapRooT® Software (that includes computerized Equifactor® Troubleshooting Tables).

Your people will want to use these tables to develop troubleshooting plans and perform their root cause analysis.

Don’t wait! You can’t afford more equipment failures and the resulting plant downtime. You need to use Equifactor® Troubleshooting and TapRooT® Root Cause Analysis to improve your equipment reliability and start saving time and money. Contact us to help you understand the potential return on your investment.

GET STARTED TODAY!

Categories
Equipment Reliability / Equifactor®, Human Performance, Interviewing & Evidence Collection, Root Cause Analysis

2 Replies to “Best Root Cause Analysis for Equipment Problems [Review]”

  • Andrew Gwie says:

    Hello. Our chemical plants have various types of hazardous processes, equipment and instrumentation. And we have lots of new construction activities occurring at our site(s).

    Although we have training in TapRoot, our company refuses to further invest in Equifactor training and materials.

    Does that consequently mean we can not progress far with the investigation of equipment and instrumentation issues & failures other than reviewing that preventive maintenance was performed appropriately?

    • Mark Paradies says:

      Here are a couple of ideas.

      1. Order the Equifactor® Book. See this link: https://store.taproot.com/book-5-using-equifactor-troubleshooting-tools-and-taproot-root-cause-analysis-to-improve-equipment-reliability

      Reading a book isn’t the most efficient way to learn, but it will get you started using Equifactor®. And if you already have the TapRooT® VI Software, you will be on the right track.

      2. Try to get approval to attend the next TapRooT® Summit and take the public Equifactor® Course as part of the package.

      3. If you don’t have Equifactor®, you will have to rely on the individual’s skills in troubleshooting equipment and I&C problems.

      You can try to collect some statistics on repeat equipment problems to show management the value of improving troubleshooting.

      Best Regards,

      Mark
