by josquindesprez   2017-09-11
If you haven't seen it already, start with Feynman's appendix to the report on the Challenger disaster: (https://science.ksc.nasa.gov/shuttle/missions/51-l/docs/roge...). Pretty much any of the reports put out by the NTSB would fit the bill. They do a lot of what I like to call 'full-stack debugging' of engineering (both of products and culture), where the stack spans all the way from materials science to management science. There's rarely a simple mechanical engineering explanation for the big failures they investigate. It's the confluence of multiple engineering and cultural failures that causes catastrophes.

For a better look at the politics of investigations, try this article on the Columbia disaster, which was on the front page here earlier this week: (https://www.theatlantic.com/magazine/archive/2003/11/columbi...).

Alternatively, since you mentioned the blackout, there's a lot to learn from analyses of the government response to Katrina, for example (https://www.amazon.com/Disaster-Hurricane-Katrina-Homeland-S...).