Think for a moment about how we treat the applications we develop and their code. We use version control systems to know exactly who changed a particular line of code, when, what, and why. We perform reviews and care about quality. We use pull requests to facilitate development. We have a collection of good practices and design patterns. We have extensive test automation in place and execute thousands of tests to prove each change is correct. We have staging environments and final smoke tests. The list could go on for quite a while. What about treating infrastructure in the same manner? Operating system, server and Continuous Integration configuration can be as big and complex as many applications are. Why not benefit from the above practices in the context of infrastructure? That, in short, is what the “Infrastructure as Code” or “Programmable Infrastructure” slogans are about.
The old (for the majority, probably still current) approach is quite different. Many administrators and developers install and maintain machines manually. A package or piece of software is downloaded, integrated and started. Many admins create a bunch of scripts for semi-automated actions so as not to repeat mundane tasks. Some try to write documentation and manuals and keep them up to date so they can maintain control over the systems they manage. The best admins work on full automation to reduce their workload to a minimum. However, several problems still persist:
- no easy way to restore infrastructure state from some point in time,
- a lot of mundane and repeatable work,
- hard to maintain standardization across many machines,
- constant juggling to keep everything up and running,
- often working directly on a live production system (no possibility to test a change on a staging environment).
Depending on the specific setup, the list of problems may vary.
The “Infrastructure as Code” approach tries to eliminate the problems mentioned above. It changes the way we think about infrastructure: we treat it in the same manner as the applications we develop, applying the same concepts and good practices:
- use a version control system to track what happens on a machine - who changed what, when, and why,
- easy restoration of previous states thanks to versioning,
- write automatic tests - if possible, even in unit style - and execute them after each change,
- have a review process in place,
- automate fully - actual work is spent only on the automation system itself,
- have staging environments for your infrastructure changes,
- apply development good practices, like keeping it simple and avoiding code duplication.
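As a minimal sketch of the first two points, configuration files can be tracked in Git just like application code (the file name, its content and the commit message below are purely illustrative):

```shell
# Hypothetical example: put a server config file under version control
mkdir -p infra
echo "worker_processes auto;" > infra/nginx.conf
git init -q infra
git -C infra add nginx.conf
# identity set inline only to keep the example self-contained
git -C infra -c user.name=admin -c user.email=admin@example.com \
    commit -q -m "Add baseline nginx configuration"
# every change is now recorded: who, when, what and why
git -C infra log --oneline
```

From here, rolling back to a previous infrastructure state is just a matter of checking out an earlier commit.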
When I first encountered the concept, it sounded to me like an impossible ideal. Is it?
In fact, a bunch of tools created over the last few years slowly allow us to implement it. Let’s go through a few of them:
- Chef and Puppet - provide full automation for your infrastructure. Declaratively define how you would like it to look and let the tools roll out the changes. They will also ensure that the configuration is preserved in the defined state. Keeping the configuration in text format allows you to use Git and pull requests to monitor and review changes,
- Docker - instead of maintaining running machines in a proper state, prepare images up front, deploy them instantly and discard them when they are no longer needed. The image configuration can be stored in text files too and placed under version control. Previously built images are also versioned by Docker itself, and the differences can easily be checked. If you are not yet familiar with Docker, you should definitely check it out - it is one of the hottest technologies in the modern development world,
- Jenkins Job Builder - a tool not yet widely known, but very useful for maintaining large Jenkins instances. As the OpenStack documentation states: “In order to make the process of managing thousands of Jenkins jobs easier, Jenkins Job Builder was designed to take YAML based configurations and convert those into jobs that are injected into Jenkins.” Again, the whole configuration can be versioned in Git and many development good practices can be applied (in particular, no duplication of the same configuration across hundreds of jobs),
- Build systems like Gradle, Maven or sbt - develop or use existing plugins which will set up all the required infrastructure from scratch. Software is automatically downloaded by the dependency mechanism, then installed, configured and started.
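To give a feel for the declarative style of Chef and Puppet, here is a minimal Puppet manifest sketch (the package and service names are illustrative). Puppet will keep the machine converged to exactly this state:

```puppet
# Illustrative manifest: keep nginx installed and its service running
package { 'nginx':
  ensure => installed,
}

service { 'nginx':
  ensure  => running,
  enable  => true,
  require => Package['nginx'],
}
```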
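The Docker approach can be sketched with a Dockerfile - a plain-text image recipe that is versioned like source code (the base image and paths below are assumptions, not a recommendation):

```dockerfile
# Illustrative image recipe: kept in Git, rebuilt on demand
FROM eclipse-temurin:17-jre
COPY build/libs/app.jar /opt/app/app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/opt/app/app.jar"]
```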
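A Jenkins Job Builder definition, in the style shown in its documentation, might look roughly like this (all names are hypothetical). One job template can be stamped out for many projects instead of duplicating the configuration by hand:

```yaml
# Illustrative Jenkins Job Builder configuration
- job-template:
    name: '{name}-unit-tests'
    builders:
      - shell: './run-unit-tests.sh'

- project:
    name: billing-service
    jobs:
      - '{name}-unit-tests'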
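For the build-system route, a Gradle sketch (Groovy DSL; the task and dependency names are illustrative) shows how dependencies are fetched automatically and how a custom task could provision infrastructure before the tests run:

```groovy
plugins {
    id 'java'
}

repositories {
    mavenCentral()
}

dependencies {
    // downloaded automatically by the dependency mechanism - no manual installs
    testImplementation 'org.junit.jupiter:junit-jupiter:5.10.2'
}

// hypothetical task: start local infrastructure before the tests run
tasks.register('startTestDb') {
    doLast {
        println 'starting local test database...'
    }
}

tasks.named('test') {
    dependsOn 'startTestDb'
    useJUnitPlatform()
}
```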
“Infrastructure as Code” changes the administrator mindset. It turns admins into developers and requires programming skills from them. It fits the DevOps concept perfectly. Looking at it from the other side, taking care of infrastructure becomes interesting for developers and is no longer considered a necessary evil.
Given the current state of tools and technology, implementing this approach takes quite a lot of time. Using it for small-scale solutions and systems might (let me emphasise: might) not pay off. However, if you are managing several dozen configurations and machines, you will notice the benefits quite fast. While implementing a new change might still take more time than in the old approach, the full tracking, automation and stability gained will outweigh the additional effort.
There will be fewer and fewer classical admins in the future - we have seen this trend for many years now. I also expect “Infrastructure as Code” tools to evolve to the point where setup and usage are so simple and user-friendly that almost no one will consider configuring infrastructure without them.