Ever have your multi-million-dollar, revenue-generating website go down? Yep, happened to me too. A couple of times. Fortunately, I wasn’t the one who caused the problem (unemployment office here I come) – but I was a stakeholder, and once the site was back up and running I made sure to request an RCA (root cause analysis for all you rookies out there). And no, “technical issues” is not an analysis.
Now, I’ve been on the giving end (providing an RCA) and I’ve been on the receiving end (receiving… an… RCA). One of the best things this analysis does is help you understand what happened, why it happened and how it will be prevented in the future. An almost forced exercise in continuous improvement, but so very valuable.
I recently read an article on Mindtools.com that I found super helpful for identifying the root cause of a problem. It’s called the “5 Whys” – created by Sakichi Toyoda, the baby daddy of Kaizen, the Toyota Way and Lean Manufacturing, back in the 1930s. The whole concept of the 5 Whys – according to Mindtools – is that the answers lead you to countermeasures, not just solutions, which will (hopefully) prevent the problem from happening again.
Here’s how you do it. Start with the problem, then ask “Why?” of each answer, five times over:
- Define the problem (The website went down from 11am to 6pm)
- “Bad code was moved into the production environment during our release this morning”
- “It wasn’t caught during testing in our staging environment”
- “The bad code that was moved wasn’t in our code review”
- “That section of the site wasn’t in our inventory”
- “There was a last-minute scope change to that section of the site and it wasn’t documented”
There you have it. Scope creep ruined your day – again.
With this information, though, there are five different pieces of data that can be used to improve the process next time.
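If you like keeping this sort of thing in a script rather than a notebook, here is a minimal Python sketch of the chain above. The `FiveWhys` class and its method names are my own invention for illustration (they don't come from Mindtools or any library), and treating the last answer as the root cause is just the assumption this example makes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FiveWhys:
    """Hypothetical helper: a problem statement plus the chain of "Why?" answers."""
    problem: str
    answers: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        # Each answer responds to "Why?" asked of the previous statement.
        self.answers.append(answer)

    def root_cause(self) -> str:
        # Assumption: the last answer in the chain is treated as the root cause.
        return self.answers[-1] if self.answers else self.problem

    def transcript(self) -> str:
        lines = [f"Problem: {self.problem}"]
        for number, answer in enumerate(self.answers, start=1):
            lines.append(f"Why #{number}: {answer}")
        return "\n".join(lines)


analysis = FiveWhys("The website went down from 11am to 6pm")
analysis.ask_why("Bad code was moved into the production environment during our release this morning")
analysis.ask_why("It wasn't caught during testing in our staging environment")
analysis.ask_why("The bad code that was moved wasn't in our code review")
analysis.ask_why("That section of the site wasn't in our inventory")
analysis.ask_why("There was a last minute scope change to that section of the site and it wasn't documented")

print(analysis.transcript())
print("Root cause:", analysis.root_cause())
```

Every entry in `answers` is one of those five pieces of data, so nothing gets lost on the way to the root cause.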
I’m certainly going to give this one a go the next time we have some technical issues. I recommend you do as well!
Oh, and for those playing at home, I thought I’d share my template for an RCA communication (I used to have to do these all the time – so here you go world).
Feel free to use it the next time you need to provide an RCA.
- Issue: A brief description of the issue in totality (the issue may have changed from when it was first reported to when it was resolved).
- Time: The time the issue is suspected to have begun.
- Severity: How bad it was, including details of what’s affected and the breadth of the issue.
- Resolution: This should state how and when it was resolved.
- Root Cause: Ask your “5 Whys” and provide the root cause here.
- Lessons Learned: How is the process going to change to ensure this won’t happen again, as well as any actions and next steps.
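And if you send these often enough to want it automated, here is a rough Python sketch that fills in the template above. `RCAReport` and `render` are hypothetical names I made up for this example. The issue, time and root cause come from the 5 Whys example earlier; the remaining fields are left as placeholders, because only you know those details.

```python
from dataclasses import dataclass


@dataclass
class RCAReport:
    """Hypothetical container for the six fields of the RCA template."""
    issue: str
    time: str
    severity: str
    resolution: str
    root_cause: str
    lessons_learned: str

    def render(self) -> str:
        # Emit the fields in the same order as the template, one bullet each.
        sections = [
            ("Issue", self.issue),
            ("Time", self.time),
            ("Severity", self.severity),
            ("Resolution", self.resolution),
            ("Root Cause", self.root_cause),
            ("Lessons Learned", self.lessons_learned),
        ]
        return "\n".join(f"- {label}: {text}" for label, text in sections)


report = RCAReport(
    issue="The website went down from 11am to 6pm",
    time="11am",
    severity="(fill in: what was affected and how broadly)",
    resolution="(fill in: how and when it was resolved)",
    root_cause="A last minute scope change to that section of the site wasn't documented",
    lessons_learned="(fill in: process changes, actions and next steps)",
)

print(report.render())
```

Making every field required means you can't build a report while skipping the root cause, which is sort of the point.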