
DevOps and More Better Analytics



This post, shared via Jenny Yang, is by Toufic Boubez.

At DevOpsDays Silicon Valley last week I gave a talk titled "Beyond Pretty Charts: Analytics for the Rest of Us." The talk was inspired by an Open Space discussion that was started at DevOpsDays Austin. It was a really good discussion with a lot of interest, so I really wanted to dive deeper. Foolishly, Patrick (@patrickdebois) and John (@lusis) gave me the opportunity to stand on my REST box (sorry) and wave my arms around in front of everyone. I think it went well. :)

The main premise of the talk and the Austin session was that current monitoring tools are clearly reaching the limit of their capabilities. That's because they make some fundamental assumptions that are no longer true. Mainly, they assume that the underlying systems being monitored are relatively static, and that their behavioural limits can therefore be defined and surrounded by static rules and thresholds. It has become clear, however, that we have moved beyond static monitoring, and that interest in applying analytics and machine learning to detect anomalies in dynamic web environments is gaining steam. Still, understanding which algorithms can identify and predict anomalies accurately within all the data we generate is not so easy.
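To make that concrete, here's a minimal sketch (my own illustration, not from the talk, with made-up numbers) of why a fixed threshold breaks down once a system's baseline drifts, while a band derived from recent data keeps up:

```python
import statistics

def static_alert(value, threshold=100.0):
    """Classic monitoring: a fixed threshold, set once by hand."""
    return value > threshold

def adaptive_alert(history, value, k=3.0):
    """Flag values more than k standard deviations from the recent
    mean, so the 'normal' band tracks the data as it drifts."""
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    return abs(value - mean) > k * stdev

# A workload whose baseline has drifted steadily upward over time:
history = [90, 95, 100, 105, 110, 115, 120, 125]

# The static rule now fires on perfectly normal traffic...
print(static_alert(130))             # True: a false alarm
# ...while the adaptive rule only flags genuine excursions.
print(adaptive_alert(history, 130))  # False: inside the moving band
print(adaptive_alert(history, 200))  # True: a real anomaly
```

The point isn't that a rolling three-sigma band is the answer (see the caveat below about distributions); it's that any useful rule has to be a function of the data rather than a constant.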

I spent time talking about the importance of knowing your data and its characteristics in order to use the appropriate analytics techniques. For example, techniques such as the three-sigma rule or Grubbs' test (check out Kale, the most excellent tool introduced by @abestanway at Etsy) are only meaningful if your data has a normal probability distribution. I also covered the importance of context, and described some simple data transformations that can give you powerful results, such as the simple act of looking at a histogram of your data, because regular time-series plots can only give you so much insight.
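As a toy illustration (my own sketch, with invented data; nothing to do with Kale's actual implementation), here is the three-sigma rule next to a crude text histogram. On skewed, latency-like data the histogram immediately shows why the rule's normality assumption doesn't hold:

```python
import statistics

def three_sigma_outliers(data):
    """Points more than three standard deviations from the mean.
    NOTE: the usual '99.7% of points fall inside the band' reading
    only holds if the data is roughly normally distributed."""
    mean = statistics.fmean(data)
    stdev = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > 3 * stdev]

def text_histogram(data, bins=10):
    """A crude text histogram: often enough to eyeball whether the
    distribution is one symmetric bump (normal-ish) or skewed."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins or 1
    counts = [0] * bins
    for x in data:
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    for i, c in enumerate(counts):
        print(f"{lo + i * width:8.2f} | {'#' * c}")

# Latency-like data: heavily right-skewed, nowhere near normal.
data = [10, 11, 12, 10, 11, 13, 12, 11, 10, 12, 50, 90, 250]
print(three_sigma_outliers(data))  # only the most extreme point
text_histogram(data)               # the skew is obvious at a glance
```

On this data the three-sigma band is so inflated by the tail that the 90 ms spike sails through undetected; the histogram makes the skew, and hence the wrong assumption, visible in seconds.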

I ended up straying a bit from my original topic, mostly due to a really good Open Space the day before about self-healing systems. I'd given a talk about control theory at the Bay LISA meetup a couple of nights before, and the combined excitement was too contagious to resist, so we got into a discussion about open- and closed-loop systems, which I had mentioned in a previous blog post. Finally, I broached the seemingly taboo topic of how much data you really need. Of course, if you know me, this led to the Nyquist-Shannon sampling theorem. :) Oh yeah, there was also mention of cats and Disaster Girl.
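For the "how much data" question, a tiny illustrative example (my own, with made-up frequencies): the theorem says you must sample at more than twice a signal's highest frequency, and if you sample slower, fast behaviour masquerades as slow behaviour (aliasing). This is exactly why a 60-second metric poll can't see spikes that come and go in 10 seconds:

```python
import math

def sample(freq_hz, rate_hz, n):
    """Sample a sine wave of frequency freq_hz at rate_hz, n points."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz)
            for k in range(n)]

# A 1 Hz signal sampled at 10 Hz: well above its 2 Hz Nyquist rate,
# so the samples faithfully represent it.
ok = sample(1.0, 10.0, 10)

# A 9 Hz signal sampled at the same 10 Hz: far below its 18 Hz
# Nyquist rate. Its samples are the mirror image of the 1 Hz ones --
# the fast oscillation has aliased into a slow one, and no amount of
# after-the-fact analytics can tell the two apart.
aliased = sample(9.0, 10.0, 10)

for a, b in zip(ok, aliased):
    assert math.isclose(a, -b, abs_tol=1e-9)
```

The practical takeaway: your collection interval puts a hard ceiling on the dynamics your analytics can ever recover, no matter how clever the algorithm.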

Check it out: the video and slides are embedded above. Please feel free to send questions or comments; I might even reply! In the next post, I'll look at some more analytics algorithms. Yes, I promise (this is reminiscent of the "free beer tomorrow" sign I see at some pubs).



Published at DZone with permission of Jenny Yang, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
