A few months ago I tried to get my head around post-mortem documentation and found it particularly hard to bridge the gap between the publicly available documents and my aim: company-internal documentation that teams could use to share knowledge and learn from past mistakes. During my research I came across lots of publicly available information which helped me dive into the topic. Unfortunately that information was widely scattered, so I thought that sharing my link collection could shorten your way a bit.
Basics
Some good reads if you want to learn what Post-Mortems are:
Foundations
If you want to look further into the topic, you have to deal with human error and failure. These will give you some idea of how large this topic is:
Instructions
On top of that, there are lots of blog posts, interviews and descriptions of how post-mortems should be conducted:
Once you have worked through all of that and want to apply it in your team, there are even some tools available:
Taken together, these give you great insight into the type of culture you should establish in your team. That culture is essentially what makes good internal documentation possible, and it also provides good input for public statements. This is what filled the gap for me.
Archives
Finally, there’s a long list of links to existing documentation. First and foremost, there’s a never-ending list of post-mortem documents on Tumblr: fuckyeahpostmortems.tumblr.com
Looking closer at these, some examples come from well-known companies:
Other industries, of course, also share post-mortem documentation or lessons learned:
Offtopic
I’ve also learned that there are programming techniques which let you debug software failures at the algorithmic level, e.g. for NodeJS or Python.
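To give a rough idea of what such a technique can look like in Python, here is a minimal sketch. It assumes that "debugging failures" here means post-mortem debugging, i.e. inspecting the state of a program at the point where it crashed. The helper `capture_post_mortem` and the sample function `average` are hypothetical names made up for this illustration; only the traceback and frame attributes come from the standard library.

```python
def capture_post_mortem(func, *args, **kwargs):
    """Run func; if it raises, return a snapshot of the failing frame.

    Hypothetical helper for illustration, not part of any library.
    """
    try:
        func(*args, **kwargs)
    except Exception as exc:
        tb = exc.__traceback__
        # Walk to the innermost traceback entry, where the error was raised.
        while tb.tb_next is not None:
            tb = tb.tb_next
        frame = tb.tb_frame
        return {
            "error": repr(exc),
            "function": frame.f_code.co_name,
            "line": tb.tb_lineno,
            # Store reprs so the snapshot stays readable and serialisable.
            "locals": {name: repr(value) for name, value in frame.f_locals.items()},
        }
    return None


def average(values):
    # Sample function that fails on an empty list.
    total = sum(values)
    return total / len(values)


report = capture_post_mortem(average, [])
print(report)
```

A snapshot like this could be attached to an incident write-up. For interactive inspection, the standard library's `pdb.post_mortem()` opens a debugger on the traceback instead.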