The 3am deploy and other things I no longer do

Early in my career I was very proud of being the person who would deploy at 3am to fix the thing. I had a hoodie. I had a personal mug at the office. I had, in retrospect, an unexamined belief that visible suffering was the same as competence.

It is not. It is mostly just suffering.

The 3am deploys did not, on inspection, correlate with the bugs being more important. They correlated with the bugs being more visible to me at 3am, which is a different and much less useful metric. The genuinely important incidents had a way of getting noticed in business hours, by more than one person, with everyone awake. The 3am ones were almost always me, alone, making things worse in a tired and creative way.

What I do now:

  • If the system is on fire, page the on-call. That’s the job. If I am the on-call, fine, but the bar is “users are actively losing money,” not “I noticed something weird.”
  • If the system is not on fire, write it down and look at it in the morning. Mornings have more brain in them.
  • Never, ever ship a fix without a test, regardless of how obvious the fix is. The 3am-me’s idea of “obvious” has a track record and it is not a good one.

This is also, I think, the under-told story of the last decade of ops: we built a lot of automation specifically so that fewer humans had to be heroes at 3am, and it worked, and we should let it work. The hoodie was cool but the on-call rotation is cooler.