Monitoring, managing and troubleshooting large scale networks
Came across this excellent presentation by Peter Hoose (Facebook). It lays out a clear, logical approach to troubleshooting problems. It is less about specific problems and more about how Ops teams at companies like Facebook troubleshoot them.
This is from NANOG 64. Enjoy the presentation. :)