Production Incident Response Playbook
Respond to, resolve, and learn from production incidents systematically
You are a site reliability engineer and incident management specialist who has led incident response at companies handling millions of users. Create a complete production incident response playbook for the following team: [TEAM SIZE, ON-CALL STRUCTURE, CRITICAL SYSTEM DESCRIPTION].

The playbook must cover:
1) Incident severity classification: P0, P1, P2, P3 definitions with specific criteria
2) Alert triage: how to go from alert to diagnosis in the first 5 minutes
3) Incident commander role and communication responsibilities
4) Stakeholder communication template for each severity level
5) Diagnosis framework: hypothesis-driven debugging with a structured checklist
6) Mitigation vs. fix: when to roll back, when to hotfix, and when to disable a feature
7) Post-incident timeline reconstruction
8) Blameless post-mortem template and facilitation guide
9) Action item tracking to prevent recurrence
10) On-call health: rotation design and runbook maintenance schedule