Automate and Validate the Drift Away
Originally Authored by Sean Rousseau
Hello, salut, hola, ni hao, guten tag…
Glad to see you found us. We’ve been terribly remiss in the communications department lately. Sadly, this is what happens when everyone around here is an engineer… nobody wants to write anything, except code! But from here on out we’re going to try hard to change that. We promise to communicate more frequently and keep you up to date on all the cool goodies we’ve been building.
To start with, I’m happy to announce that we’ve released a few big features that our beta customers have been waiting for: scheduled snapshots, automated differencing, and email alerts. This latest release is all about making it easy to prevent, manage, and resolve a very special static anomaly — server drift. With these new capabilities, you will be able to troubleshoot environment bugs much faster than before and be warned of unexpected changes before they cause havoc.
Automated provisioning solutions such as Puppet and Chef reduce the chance of drift by ensuring uniform deployment of packages, services, and files. They are great tools, especially for the dynamic data center where nodes can have a very short existence. Automated provisioning is a necessary and good precaution against drift, but it doesn’t guarantee ongoing uniformity or even uniformity immediately after provisioning.
Machines which time out during provisioning can miss out on installation. Virtual machines created during or after provisioning will miss out. Software downloads from online repositories may introduce differently versioned files from different servers. Then there are those servers we treat as pets, full of personality and ad hoc changes which are not propagated. All of these introduce inconsistencies into the system, anomalies which over time can erode system integrity and cause errors, inconsistent application behavior between servers, diminished performance, and even downtime. An analogy that comes to mind is the network firewall: firewalls offer a measure of protection, but they don’t prevent intrusions or problems from within. You still need intrusion detection technology to round out your security suite.
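To make the idea of differencing concrete, here is a minimal sketch of comparing two snapshots of a server. It is not how our tool is implemented; the `diff_snapshots` function and the `{name: version}` snapshot shape are assumptions chosen purely for illustration.

```python
def diff_snapshots(before, after):
    """Compare two {package_name: version} snapshots and report the differences."""
    before_keys, after_keys = set(before), set(after)
    return {
        # Present only in the newer snapshot.
        "added": {k: after[k] for k in after_keys - before_keys},
        # Present only in the older snapshot.
        "removed": {k: before[k] for k in before_keys - after_keys},
        # Present in both, but with a different version.
        "changed": {
            k: (before[k], after[k])
            for k in before_keys & after_keys
            if before[k] != after[k]
        },
    }


# Hypothetical snapshots of two servers that were provisioned the same way.
web01 = {"openssl": "1.0.1", "nginx": "1.4.6", "ruby": "1.9.3"}
web02 = {"openssl": "1.0.1", "nginx": "1.4.7", "curl": "7.35.0"}

drift = diff_snapshots(web01, web02)
```

The same comparison works for one server at two points in time or for two servers that are supposed to be identical. An empty result in all three buckets means no drift was found in the data that was captured.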
As pointed out by John E. Vincent in his blog post “Configuration Drift and Next-gen CM”, configuration management systems aren’t assertive enough. They are designed to verify the state of a resource at the point they run. They don’t manage the state of those resources until they next inspect them. Configuration management tools don’t get run in response to those resources changing but in response to a user asking them to be checked. To compound the matter, without constant verification, drift can occur across nodes of different types even though they share a verified common base block. Cliff Moon sums up the challenge pretty well: “Of all the problems to fix in Chef or Puppet, the diffusion and drift of state that occurs in idiomatic usage seems highest priority.” Our latest release fills that role pretty well, I’d say.
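One way to picture verification between configuration management runs is a check that re-fingerprints a set of files and compares them to a recorded baseline. The sketch below is an illustration only, not a description of our product or of Chef or Puppet; `fingerprint`, `find_drift`, and the baseline mapping are hypothetical names.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return the SHA-256 hex digest of a file's contents, or None if it is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()


def find_drift(baseline):
    """Return the sorted paths whose current digest differs from the baseline.

    `baseline` maps a file path (str) to the digest recorded when the
    file was last known to be in its desired state.
    """
    return sorted(
        path for path, expected in baseline.items()
        if fingerprint(path) != expected
    )
```

Run on a schedule, a non-empty result from `find_drift` is the kind of event that could trigger an alert, rather than waiting for someone to ask for the next check.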
If you’d like to get an email alert every time any of your servers drifts out of its desired state, join our Beta. We’d love your feedback on our tool. Bug reports are welcome too.