Root Cause Analysis
Systematically trace a problem back to its root cause instead of treating symptoms.
What it does
When something goes wrong (a production incident, a missed deadline, a recurring customer complaint), this prompt walks you through a structured root cause analysis. It goes beyond “what happened” to uncover why it happened and what systemic fix prevents recurrence. It works for technical incidents, process failures, and organizational problems.
The Prompt
I need to find the root cause of a problem.
Problem: [DESCRIBE WHAT WENT WRONG]
Timeline of events:
- [WHEN DID IT START / WHEN WAS IT NOTICED]
- [WHAT HAPPENED IN SEQUENCE]
- [WHEN/HOW WAS IT RESOLVED, IF RESOLVED]
Impact: [WHO/WHAT WAS AFFECTED AND HOW SEVERELY]
What we think caused it: [CURRENT THEORY, IF ANY — it's fine to say "no idea"]
Please analyze as follows:
1. FIVE WHYS: Starting from the visible symptom, ask "why?" five times. At each level, identify the mechanism — don't just restate the problem in different words.
2. CONTRIBUTING FACTORS: What conditions had to be true for this to happen? Separate into:
- Direct cause (the thing that broke)
- Enabling conditions (things that allowed the break to have impact)
- Absent safeguards (things that should have caught it but didn't)
3. SYSTEMIC vs. PROXIMATE: Is the root cause an isolated, one-time trigger (proximate) or a pattern/process gap that would produce similar failures again (systemic)? What evidence supports that classification?
4. FIX RECOMMENDATIONS: For each level of cause, suggest a fix. Rank by:
- Effectiveness (does it actually prevent recurrence?)
- Effort (quick win vs. major project)
- Breadth of benefit (does the fix improve other things too, or is it narrow?)
5. VERIFICATION: How would you confirm the root cause is correct? What evidence would prove your analysis wrong?
Be direct. If my timeline has gaps that make root cause uncertain, tell me what information is missing rather than guessing.
Usage Notes
- The “Five Whys” works best when each level names a concrete mechanism. “Why did the server crash?” → “Because memory usage exceeded 16 GB” is good. “Because the system failed” only restates the symptom.
- For production incidents, paste relevant log snippets or error messages into the timeline. The more concrete the evidence, the better the analysis.
- The “Verification” step is often skipped in real post-mortems, but it is arguably the most important. A root cause analysis that can’t be falsified is just a plausible story.
- For recurring problems, run this prompt on three separate occurrences and look for contributing factors common to all three.
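
For the recurring-problem note, the comparison across occurrences can be done by hand, but a small script helps once the lists get long. The following is a minimal Python sketch, not part of the prompt itself. The incident names and factor labels are hypothetical placeholders, and it assumes you have already normalized the contributing factors from each analysis into consistent labels (the same factor must be spelled the same way each time).

```python
from collections import Counter

# Hypothetical example data: contributing factors recorded for three
# occurrences of the same recurring problem. Labels are illustrative only.
occurrences = {
    "incident-1": {"no staging test", "manual deploy", "alert threshold too high"},
    "incident-2": {"manual deploy", "missing runbook", "alert threshold too high"},
    "incident-3": {"manual deploy", "alert threshold too high", "on-call handoff gap"},
}


def common_factors(occurrences):
    """Return the factors that appear in every occurrence."""
    factor_sets = list(occurrences.values())
    if not factor_sets:
        return set()
    return set.intersection(*factor_sets)


def factor_counts(occurrences):
    """Count how many occurrences each factor appears in."""
    counts = Counter()
    for factors in occurrences.values():
        counts.update(factors)
    return counts


if __name__ == "__main__":
    # Factors shared by all occurrences are candidates for a systemic cause.
    print("Common to all:", sorted(common_factors(occurrences)))

    # Factors seen in most, but not all, occurrences are still worth a look.
    for factor, n in factor_counts(occurrences).most_common():
        print(f"{n}/{len(occurrences)}  {factor}")
```

Factors that show up in every occurrence are the strongest candidates for a systemic cause. Factors that appear only once are more likely proximate, though that is a starting hypothesis to verify rather than a conclusion.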