Published by Noria Corporation
Magazine: December 2022

Root Cause Analysis: What's the Point?

Brian Hughes

RCA: Current State

Maintenance and reliability people fix failed equipment regularly, and when the impact of a failure is substantial, a root cause analysis (RCA) is often required. But what’s the point? Once we’ve fixed the issue, shouldn’t we leave well enough alone and get on with our lives? We’ve got enough going on. Why do we need to take multiple people away from their jobs to spend several hours analyzing something that’s in the past?

A root cause analysis (or any analysis) requires an investment of our scarcest resource: time. Why should completing an RCA be how we spend our precious time? The answer to this question might seem obvious, but it’s always worth asking.

We invest in RCAs because, as with any investment, we believe they will yield positive returns. But that’s not always the mindset of those performing these analyses. Many times, we do an RCA because it’s mandated by leadership. Those investigating don’t believe it’s worth the time, but it must be done because, you know, those are the rules. So, the standard becomes the “minimum viable product” that gets the boss to sign off that the RCA was completed.

This is not the way to run an effective RCA program. RCA done this way achieves very little and can do more harm than good. It becomes a tax on time that adds little value and generates considerable resentment. But it doesn’t have to be this way. A trusted process and set of tools are necessary for a successful RCA program. But we first need to establish why anyone would want or need to invest their time and money in RCA.

Let’s talk for a minute about our current reality. At this moment, we are on an elevator of technological progress, exponentially accelerating upward. Take the progress of the wheel, for example; the wheel was first invented around 5,200 B.C., and it was likely used for pottery, not carts. It took nearly a thousand years for the wheel to be used for transportation, and it wasn’t until the 19th century that we started seeing powered vehicles. Now, as we approach the end of the first quarter of the 21st century, we find that cars can almost drive themselves.

Most of the advancements in technology happened within the last 150 years or so. If the 7,000 years since the first wheel equated to one hour, nearly all the serious progress would have occurred in roughly the final minute (150 of 7,000 years is about 2 percent of the hour). And that progress continues to accelerate.


Figure 1. Vehicle Complexity Over Time

So, what does that mean? Problems, and lots of them. Our drive to advance technology is spawning more problems of greater complexity that require solutions in a shorter amount of time. Of course, people in different industries will experience this phenomenon to different degrees; it’s not the same for everyone. But it is happening universally, and if we don’t find a way to become great at solving these problems quickly, there may come a time in our not-so-distant future when these problems overwhelm us.

So, what’s the point behind root cause analysis? RCA processes and tools allow us to solve difficult or “wicked” problems better and faster. These problems are often bigger than any single person. None of us has a monopoly on knowledge. We need to bring others into the mix to overcome gaps in what we think we know. At its best, RCA allows groups of diverse experts to quickly learn from each other to explain how the problem happened and what should be done to prevent future incidents.

We need to develop cultures that learn faster than machines fail. To do this, we need root cause analysis that is the right size (scalable) for the problem at hand and that works, delivering consistent value for the time invested. When RCA is done correctly, those performing it recognize the value in the process and no longer do it simply because the boss asked for it.

Root Cause Analysis — Five Steps

There are several variations of root cause analysis, and typically all involve some mixture of the following five steps:

  1. Gather evidence and data to be used to draw and support conclusions.
  2. Document important problem information, such as what the specific problem is, when it happened, where it happened and what the impact of the problem is.
  3. Identify the causes of the problem. What happened that led up to the problem?
  4. Determine what will be done to solve the problem.
  5. Share what was learned with others.

Gathering Evidence and Data

Imagine you’re cooking dinner for a large group, and you want it to be terrific. Every chef knows that buying the best ingredients is the first step to a great meal. Evidence and data are the “ingredients” for an RCA. Usually, when a failure occurs, all energy is directed toward bringing the asset back online. In the rush to recover, data and evidence can wind up being discarded or destroyed. This is always a mistake. We need to gather broken parts and equipment, samples of fluids, system data, documentation and witness or expert statements as soon as possible to facilitate future learning.

You can find evidence from a variety of different sources:

People: Witnesses, operators, maintenance technicians, design engineers, safety engineers, OEM representatives and outside experts are all excellent sources of information. Remember, no single source knows everything about the problem, so it’s crucial to draw on several.

Procedures and Documentation: Look for documented evidence. This includes:

  • Preventative and predictive maintenance task records
  • Maintenance, operation and installation manuals
  • Operations and maintenance reports
  • Design diagrams
  • Piping, instrumentation and electrical diagrams
  • Documentation of past failures

Photos, Video, Audio: Many facilities are under 24-hour video surveillance. If available, try to get these videos or photos, or tour the scene and take photos and video yourself, making sure to adhere to all organizational guidelines. If audio files exist, such as from two-way radio communication, these can also be good sources of information. While online videos can be helpful resources, don’t become reliant upon them, and always make sure they’re from a reputable source.

Hardware, Software, Systems: Different systems can offer unique insights. To draw out that information, ask questions such as:

  • What equipment failed? Was it a specific component or multiple?
  • What was the design intent?
  • How does each component fit into the overall system?
  • How was it being operated compared to how it was designed to operate?

Environment: Environmental causes are also important considerations. Ask yourself:

  • Was it hot or cold? Wet or dry?
  • Was it inside or outside?
  • What was the business environment at the time? Was it seasonally busy or slow?

Gathering evidence from diverse sources as soon as possible after the event is one of the most important parts of a high-quality RCA.

State the Problem

The problem needs to be accurately stated, and key information should be documented. When developing a problem statement for a reliability issue, it’s helpful to use the following formula:

“Asset ABC Unavailable + XX Hours to Recover”

Writing the problem statement this way ensures the causal analysis covers both halves of the story: why and how the asset went down, and why it took as long as it did to bring the asset back online.

We also need to document the time and date, where the problem happened and the actual and potential impacts of the problem. It’s particularly important to document both actual and potential impacts because leaders need to know how bad the problem was as well as how bad it could have been. Finally, it’s useful to include how often this type of problem has happened in the past.
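As a rough illustration, the information described in this section could be captured in a small record like the one below. This is only a sketch: the class name, field names and the example pump are assumptions made up for illustration, not part of any particular RCA tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProblemStatement:
    """Hypothetical record of key problem facts; all field names are illustrative."""
    asset: str
    hours_to_recover: float
    occurred_at: datetime
    location: str
    actual_impact: str
    potential_impact: str
    prior_occurrences: int = 0  # how often this type of problem has happened before

    def headline(self) -> str:
        # Follows the "Asset ABC Unavailable + XX Hours to Recover" formula.
        return f"{self.asset} Unavailable + {self.hours_to_recover:g} Hours to Recover"


# Made-up example values.
statement = ProblemStatement(
    asset="Pump P-101",
    hours_to_recover=6.5,
    occurred_at=datetime(2022, 11, 3, 14, 20),
    location="Cooling water pump house",
    actual_impact="6.5 hours of lost production",
    potential_impact="Full line shutdown if the backup pump had also failed",
    prior_occurrences=2,
)
print(statement.headline())
```

Keeping actual and potential impact as separate fields makes it harder to skip one of them when the statement is written up.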

Analyze the Causes

There are several ways of analyzing causes. But what’s the “right” way? The truth is, there is no one right way, only the way that works best for you. "All models are wrong, but some are useful" is a saying attributed to George Box, and it’s appropriate here. Ask yourself, does your cause-and-effect analytical model:

  • Help you manage input from the group?
  • Help you tell the complete story of the event?
  • Accomplish these things in a way that doesn’t add to the burden of the analysis?

If so, then your model is useful.

At Sologic, we like to create a model by starting with the problem and then working backward in time, identifying the cause-and-effect relationships that led up to the problem. It’s like playing a movie backward, frame by frame. When you analyze causes in this way, the group can clearly see how the causes combined to produce the problem.

Model templates can be extremely helpful. For instance, the template below works for most reliability issues.


Figure 2. Reliability Problem-Solving Chart

This template starts with the formula described in the “State the Problem” section, which is Asset Unavailable + XX Hours to Recover. The top branch then prompts the investigation team to discover the story of the fault and what brought the asset down. The bottom branch asks them to account for the hours required to achieve recovery. Some of that time was used to make the system safe, some went to diagnosing the fault and the rest to making the repair. This template can be scaled to fit the complexity and severity of the problem to help ensure we don’t waste time over-investigating.
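The bottom branch of the template amounts to simple accounting, which can be sketched as follows. The function name, phase labels and hour values are assumptions chosen for illustration; real figures would come from work orders or operator logs.

```python
def recovery_breakdown(make_safe: float, diagnose: float, repair: float) -> dict:
    """Return total recovery hours and each phase's share of that total."""
    phases = {"make safe": make_safe, "diagnose": diagnose, "repair": repair}
    total = sum(phases.values())
    if total <= 0:
        raise ValueError("recovery time must be positive")
    shares = {name: hours / total for name, hours in phases.items()}
    return {"total_hours": total, "shares": shares}


# Made-up hours for a single event.
result = recovery_breakdown(make_safe=1.0, diagnose=3.0, repair=4.0)
print(f"Total: {result['total_hours']:g} hours")
for name, share in result["shares"].items():
    print(f"  {name}: {share:.1%}")
```

Seeing which phase consumed the most time can suggest where to look first, for example at diagnostic tools and training if diagnosis dominates.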

Note that a method like this won’t lead the team to a single root cause. In fact, the farther back in time you go, the more the branches will diverge from each other. It’s important to remember that there is no single root cause for any given event. Therefore, searching for one is futile. What’s more important is to understand how the causes work in conjunction with each other to result in the problem.
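One way to picture this branching is as a simple mapping from each effect to its causes, walked backward from the problem. The sketch below is not the Sologic method or the Causelink data model; the structure and every cause listed are made-up assumptions for illustration.

```python
# Each effect maps to the causes that produced it (all entries are made up).
causes_of = {
    "Pump unavailable": ["Bearing seized", "No standby pump available"],
    "Bearing seized": ["Lubricant degraded", "Bearing ran hot"],
    "Lubricant degraded": ["Oil change interval missed"],
    "Bearing ran hot": ["Shaft misaligned"],
    "No standby pump available": ["Standby pump out for overhaul"],
}


def walk_back(effect, depth=0, seen=None):
    """Yield (depth, cause) pairs, moving backward in time from the effect."""
    seen = set() if seen is None else seen
    for cause in causes_of.get(effect, []):
        if cause in seen:  # guard against revisiting a cause (and against cycles)
            continue
        seen.add(cause)
        yield depth + 1, cause
        yield from walk_back(cause, depth + 1, seen)


chain = list(walk_back("Pump unavailable"))
for depth, cause in chain:
    print("  " * depth + cause)

# Causes with nothing recorded behind them are where this analysis stops.
endpoints = [cause for _, cause in chain if cause not in causes_of]
print(len(endpoints), "branch endpoints:", endpoints)
```

Even in this tiny example the walk ends at three separate endpoints rather than one, which is the divergence described above.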

Solve the Problem

Solutions control causes. When you use a causal model like the one above, you can easily identify solutions that control individual causes. As noted earlier, no event has a single root cause. This fact is liberating in that it frees the team to identify any number of solutions that control the causes identified in the model. Ultimately, a diversified “basket” of solutions is desired.
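A quick way to check whether the basket is diversified enough is to list which identified causes each solution controls and see what is left over. The causes and solutions below are invented for illustration only.

```python
# Causes identified in the analysis (made-up examples).
identified_causes = [
    "Oil change interval missed",
    "Shaft misaligned",
    "Standby pump out for overhaul",
]

# Each proposed solution lists the causes it is meant to control.
solutions = {
    "Add oil change to the PM schedule": ["Oil change interval missed"],
    "Verify alignment after every rebuild": ["Shaft misaligned"],
}

controlled = {cause for targets in solutions.values() for cause in targets}
uncontrolled = [cause for cause in identified_causes if cause not in controlled]

print("Controlled:", sorted(controlled))
print("Still uncontrolled:", uncontrolled)
```

An uncontrolled cause is not necessarily a gap; the team may decide it is not worth addressing, but the decision should be deliberate.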

Report Findings

Once the team has gathered evidence, thoroughly defined the problem, analyzed its causes and identified solutions, the final step is to share what was learned. The investigation team knows more about the problem than anyone else. Therefore, they are in the best position to help teach the rest of the organization. The best way to do this is by creating a thorough and thoughtful incident report. An incident report doesn’t need to be long — it just needs to tell the story in a way that helps others learn.

Putting It All Together

Technological advances are spawning an ever-greater number of complex problems. Success in such a world requires that we employ organizational learning techniques, including root cause analysis, in ways that help us leverage the diverse knowledge and brainpower at our disposal. Ultimately, success results in an organization that learns faster than it fails.

About the Author
Brian Hughes

Brian Hughes is President of Sologic, a leading provider of structured problem-solving methods, training, and tools including Causelink software.<...
