So you want to be a scribe? You've come to the right place! You don't need to be a senior team member to become a deputy or scribe; anyone can do it, provided you have the requisite knowledge!
Credit: Holly Chaffin
The purpose of the Scribe is to maintain a timeline of key events during an incident, documenting actions and keeping track of any followup items that will need to be addressed.
It's important for the rest of the command staff to be able to focus on the problem at hand, rather than worrying about documenting the steps.
Your job as Scribe is to listen to the call and to watch the incident Slack room, keeping track of context and actions that need to be performed, documenting these in Slack as you go. You should not be performing any remediations, checking graphs, or investigating logs. Those tasks will be delegated to the subject matter experts (SMEs) by the Incident Commander.
Before you can be a Scribe, it is expected that you meet the following criteria. Don't worry if you don't meet them all yet; you can still continue with training!
- Excellent verbal and written communication skills.
Read up on our Different Roles for Incidents to see what is expected from a Scribe, as well as what we expect from the other roles you'll be interacting with.
There is no formal training process for this role; reading this page should be sufficient for most tasks. Here's a list of things you can do to train, though:
- Read the rest of this page, particularly the sections below.
- Participate in Failure Friday (FF).
  - Shadow an FF to see how it's run.
  - Be the scribe for multiple FFs.
Scribing is more art than science. The objective is to keep an accurate record of important events that occurred on the call, so that we can look back at the timeline to see what happened. But what exactly is important? There's no definitive answer, and it really comes down to judgement and experience. But here are some general things you most definitely want to capture as scribe.
- The result of any polling decisions.
  - ✘ This is not "9 people voted yay, 3 voted nay".
  - ✓ It is "Polled for if we should do rolling restart. <USER_A> is proceeding with restart."
- Any followup items that are called out as "We should do this..", "Why didn't this?..", etc.
  - ✘ This is not "Why isn't the Support representative on the call?"
  - ✓ This is "TODO: Why didn't we get paged for this earlier?"
The Steps for Scribe provide a detailed description of what you should be doing during an incident.
Here are some examples of phrases and patterns you should use during incident calls.
At the start of any major incident call, you should start our status stalking bot so that it automatically posts updates to the room.
!status stalk
This will provide the update and allow the IC to see the status without having to keep asking.
During a call, you will hear lots of discussion happening, but you should not be documenting all of it in the chat room. You only want to document things that will be important for the final timeline. It's not always obvious what these might be, and it's usually a matter of judgement. You generally want to note any actions the IC has asked someone to perform, along with the result of any polling decisions.
Polled for decision on whether to perform rolling restart. We are proceeding with restart. [USER_A] to execute.
Some actions might seem important at the time, but end up not being. That's OK. It's better to have more info than not enough, but don't go overboard.
Sometimes during the call, someone will either mention something we "should fix", or the IC will specifically ask you to note a followup item. You can do this in Slack by simply prefixing the message with "TODO", which will make it easier to search for later.
TODO: Why did we not get paged for the fall in traffic on [X] cluster?
The post-mortem owner will find these afterwards and raise tasks for them.
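To illustrate why a consistent "TODO" prefix helps, here is a minimal sketch of how followup items could be pulled out of a chat log after the incident. It is only an illustration: the `Message` shape, the `extract_todos` helper, and the sample timestamps and authors are all hypothetical, and it does not represent a Slack export format or any tooling we actually run.

```python
from typing import NamedTuple


class Message(NamedTuple):
    """One chat message. Hypothetical shape, not a real Slack export format."""
    timestamp: str
    author: str
    text: str


def extract_todos(messages: list[Message]) -> list[Message]:
    """Return the messages whose text starts with the "TODO" prefix, in order."""
    return [m for m in messages if m.text.lstrip().startswith("TODO")]


# Made-up sample log for illustration only.
log = [
    Message("14:02", "scribe", "Polled for decision on whether to perform rolling restart. We are proceeding with restart."),
    Message("14:05", "scribe", "TODO: Why did we not get paged for the fall in traffic on [X] cluster?"),
    Message("14:09", "sme", "Restart complete."),
]

for todo in extract_todos(log):
    print(f"{todo.timestamp} {todo.text}")
```

Because every followup item starts with the same prefix, a plain prefix match (or an ordinary Slack search for "TODO") is enough to find them all; no free-text interpretation of the discussion is needed.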
When the IC ends the call, you should post a message into Slack to let everyone know the call is over, and that they should continue discussion elsewhere.
Call is over, thanks everyone. Follow up in Slack.
Don't forget to also stop the status stalking.
!status unstalk