DevOps Dojo: Infrastructure Monitoring and Stability
Yesterday I was invited to share my thoughts on infrastructure monitoring and stability in an Atlassian Open DevOps Dojo hangout. It was great to join infrastructure gurus Roy Rapoport (Netflix), Jeff Behl (LogicMonitor), and Mark Breitung (Atlassian) for some data-nerd-on-data-nerd conversation.
We discussed the hazy difference between infrastructure and application monitoring, the increasing importance of anomaly detection, dashboards, war stories, and even a little DevOps culture chat. With the topic of the open dojo out of the way, we convinced Roy to tell us all about Chaos Monkey and the rest of the Simian Army. It's super cool what those crazy Netflix folks have running in production!
Give it a watch and let me know what you think. Did we miss something important? Have a question? Something to add? Leave a comment and join the conversation!
Also, a big high-five to Sarah Goff-Dupont for organizing the hangout!