# Blameless Incident Postmortem Generator
Turn incident notes into a structured, blameless postmortem with timeline, root cause chain, and concrete action items.
## What it does
Takes your raw incident notes — Slack messages, on-call logs, timestamps, whatever you have — and produces a structured postmortem document. The key differentiator: it enforces blameless language, separates contributing factors from the root cause, and generates action items that are specific enough to actually track.
## The Prompt
Generate a blameless incident postmortem from the following incident notes.
Incident notes:
[PASTE RAW NOTES — Slack messages, timeline entries, on-call logs, whatever you have. Messy is fine.]
Service/system affected:
[WHICH SYSTEMS WERE IMPACTED]
Impact duration:
[START TIME → DETECTION TIME → MITIGATION TIME → RESOLUTION TIME, or best estimates]
User impact:
[WHAT USERS EXPERIENCED — errors, latency, data issues, complete outage]
Structure the postmortem as follows:
## Summary
One paragraph: what happened, how long it lasted, and what was impacted. Written so someone outside the team can understand it.
## Timeline
Chronological table: Time | Event | Actor/System
Start from the triggering event (not from when on-call was paged). Include automated system responses (alerts firing, auto-scaling, circuit breakers) alongside human actions.
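Example row format (all times, events, and service names below are illustrative placeholders, not from a real incident):

```markdown
| Time (UTC) | Event | Actor/System |
|------------|-------|--------------|
| 14:02 | Config change deployed to payment-api | Deploy pipeline |
| 14:05 | Error rate on /checkout exceeds 5% | payment-api |
| 14:11 | Alert fires; on-call paged | Alerting |
| 14:18 | Deploy rolled back | On-call engineer |
| 14:26 | Error rate returns to baseline | payment-api |
```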
## Root Cause Chain
Do NOT write a single root cause. Write a CHAIN of contributing factors:
- Triggering event: The specific change or condition that initiated the incident
- Enabling condition: What made the system vulnerable to this trigger (missing guard, untested path, stale config)
- Propagation factor: Why the impact spread beyond the initial failure point
- Detection gap: Why it took [X minutes] to detect (if detection was slow)
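Example chain (hypothetical; every service name and detail here is a placeholder for illustration):

```markdown
- Triggering event: A config change reduced payment-api's connection-pool size from 200 to 20.
- Enabling condition: The config value was not validated against a minimum, and no canary stage exercised it.
- Propagation factor: Exhausted connections triggered retries from upstream services, amplifying the load.
- Detection gap: Alerting keyed on error rate rather than connection-pool saturation, delaying detection.
```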
## What Went Well
At least 2 things. Detection speed, team response, containment effectiveness, communication quality. Be specific.
## What Went Poorly
At least 2 things. Detection gaps, missing runbooks, unclear ownership, tooling gaps. Be specific.
## Action Items
For each action item:
- Action: Specific, completable task (not "improve monitoring")
- Type: PREVENT (stop recurrence) / DETECT (catch it faster) / MITIGATE (reduce blast radius)
- Owner: [TO BE ASSIGNED]
- Priority: P0 (this week) / P1 (this sprint) / P2 (this quarter)
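Example action item (illustrative only; the task and service name are placeholders):

```markdown
- Action: Add minimum-value validation for connection-pool size to the payment-api config schema
- Type: PREVENT
- Owner: [TO BE ASSIGNED]
- Priority: P1
```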
Rules:
- NEVER use phrases like "X failed to" or "Y should have." Describe what happened, not who failed.
- Write action items as engineering tasks, not behavioral changes ("add circuit breaker on service X" not "be more careful with deployments").
- If something is unclear from the notes, flag it as "[UNCLEAR — verify with team]" rather than guessing.
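To illustrate the phrasing rule (hypothetical wording):

```markdown
Blaming:   "The engineer failed to run the migration checklist."
Blameless: "The migration ran without the checklist; the deploy tooling did not require it."
```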
## Usage Notes
- Feed in raw, messy notes. The prompt is designed to handle unstructured input — you don’t need to clean it up first.
- The root cause chain format is more useful than a single root cause. Incidents almost never have one cause; they have a trigger that hits an enabling condition.
- The “never use ‘failed to’” rule is what keeps the document genuinely blameless. Without it, generated postmortems tend to slip into finger-pointing language.
- Action items typed as PREVENT/DETECT/MITIGATE help you balance your reliability investment. If all your action items are PREVENT, you’re neglecting detection and mitigation.
- Run the output past the incident responders before publishing. The AI structures well but may misinterpret sequence or causality from ambiguous notes.