These are the things I'm trying out to keep their Chef codebases (and the infrastructure they control) in shape:
- Linting: syntax checking (rb, erb, conf files, etc.), style checks, and some best-practice checks (like checking for Chef Solo usage), mostly using Foodcritic wrapped in Rake and run under the Go engine (this setup was initially done by Nikhil).
- Checking context-level best practices (checking for defined environments, nodes with an empty run list, the number of updated resources after two consecutive runs [to check idempotency], direct assignment of recipes [always via a role], etc.) using RSpec, the Chef API, and Rake. This is more like an integration test.
- Infrastructure tests: triggering NRPE-based tests or Minitest report handlers to verify that service provisioning has taken place correctly.
- Versioning cookbooks, freezing cookbook versions per environment, and above all enforcing naming conventions like (app_project_environment). The rest of the checks build their tooling on top of these conventions. Anything that does not adhere to them is bound to become a work of art.
- Measuring most of the stuff using defined states and quantifiable metrics (where possible), and then graphing it (Nagios/NRPE and Graphite).
- Having a common understanding of what goes where (a definition? a library? an LWRP? multiple recipes?) inside a Chef codebase.
- And right now, I'm in the process of setting up a CI server to test the community cookbooks plus our own against Ubuntu/CentOS containers (using OpenVZ), wired into our own build pipelines.
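The context-level checks above can be sketched in plain Ruby. This is a minimal illustration, not our actual test suite: the node hashes are hypothetical sample data standing in for what the Chef API returns, and the helper names are my own.

```ruby
# Flag nodes with an empty run_list, and nodes that include recipes
# directly instead of going through a role. Node data here mimics the
# shape returned by the Chef API (hypothetical sample values).
def empty_run_list?(node)
  Array(node["run_list"]).empty?
end

def direct_recipes(node)
  # run_list entries look like "role[name]" or "recipe[name]"
  Array(node["run_list"]).select { |item| item.start_with?("recipe[") }
end

nodes = [
  { "name" => "web1", "run_list" => ["role[app_myproject_production]"] },
  { "name" => "db1",  "run_list" => [] },
  { "name" => "web2", "run_list" => ["recipe[nginx]"] },
]

bad_empty  = nodes.select { |n| empty_run_list?(n) }.map { |n| n["name"] }
bad_direct = nodes.reject { |n| direct_recipes(n).empty? }.map { |n| n["name"] }

puts "empty run lists: #{bad_empty.inspect}"
puts "direct recipe assignment: #{bad_direct.inspect}"
```

In the real setup the same predicates sit inside RSpec examples and the node list comes from a Chef API search, but the logic is just this.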
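Enforcing the naming convention mechanically is cheap once you commit to it. A sketch, assuming the app_project_environment convention means three lowercase underscore-separated segments (the exact pattern is an assumption; adjust to taste):

```ruby
# Hypothetical convention check: role names must look like
# "app_project_environment", i.e. three lowercase alphanumeric
# segments joined by underscores.
ROLE_NAME = /\A[a-z0-9]+_[a-z0-9]+_[a-z0-9]+\z/

def conforming_role?(name)
  !!(name =~ ROLE_NAME)
end

puts conforming_role?("app_myproject_production") # => true
puts conforming_role?("MyRole")                   # => false
```

Run a check like this over every role in the repo and fail the build on the first work of art.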
Lastly, if you are building SaaS or PaaS, you are bound to hit a volume of Chef/Puppet/CFEngine code that needs its own CI.