The things we find hardest in incident response
Incident.io Blog – Chris Evans
Stephen: Working out the most highly leveraged role to play
Whenever I'm leading larger incidents, I find it really hard to avoid the temptation to debug and fix the thing myself, as opposed to assembling a team and running the process that fixes the thing. Because of my technical background, it's always tempting to dive into the logs, look at the code, and see how I might fix things.
You might be qualified to do it, but that's not why you're here – you're here to lead the incident.
Chris: Getting up to speed without disrupting the flow
More often than not, I'd be pulled in after a number of people had already been involved and were deep in the weeds of a problem.
Take a breath and accept that you can't hit the ground at 100 mph.
Pete: Making decisions quickly as an individual vs context sharing and consensus
In most cases, I have time to run those actions and assumptions past others, to make a more informed decision as a team and share context. In other cases, something is urgently wrong, and you find yourself forced into the less satisfying equivalent of "I can't explain why right now, but trust me, this is the right thing to do". This can understandably be both concerning and frustrating for others involved, and is quite a blunt instrument – it can be hard to know when it's the right call.
Keeping track of threads (virtual, not Slack)
At the risk of sounding like I've primed you for a sale, incident.io does make a lot of this easier, but there's plenty more we can do to make this better!
Lisa: Striking a balance between trusting your gut and systematically gathering evidence
If you jump to conclusions too quickly, it's possible to spend a long time down a rabbit hole before stepping back and realising that you don't really know that the problem is in system X – you're here on a hunch. Equally, if you ignore your experience completely, you're trying to solve a problem with one hand tied behind your back.
Lawrence: Recovering from bad assumptions
Once the team responding to an incident has established a "fact", they'll be anchored by it, and all the conclusions that follow can be affected. Try to avoid this. The best incident policy is trust-but-verify: ask for proof, or better yet, go back into your Slack logs (or whatever you use!) to confirm that the evidence supports the statement. Good responders should be pushing as much evidence into the incident log as possible, which should make this easy to verify.
Link: https://incident.io/blog/the-things-we-find-hardest-in-incident-response