Configurations: The Smell
Does your server have a long list of items in its configuration file? That may hint at a configuration smell, as well as a code smell. It's not just that configurations are hard to manage: they effectively form a computer language of their own, a DSL, which in most cases has no compiler to assist in finding errors.
In addition, the IDE usually gives us far less help with configurations than with standard programming languages. In extreme situations, you may find yourself programming with configurations instead of a proper programming language. If you have ever been in such a situation, you know how painful and error-prone that process can be.
Having said all that, there are of course cases where many configurations do make sense. If you are developing a framework or a generic server that is published publicly and expected to be used by many people in many different ways, then obviously you need to react to the demands of different developers after the software has been released to the world.
In this case, the configuration is an inherent part of your application; the application does not make sense without the ability to change these configurations dynamically. This doesn't mean all configurations must be exposed right away. In fact, it's best if the exposed part is close to zero by default, and anything that is exposed comes with samples; better still, the software adapts dynamically to various users without requiring them to configure it at all. This way, users update and add configurations only when they really must, and with proper samples to guide them.
As for microservices, too many configurations for one microservice mean the service is not really a microservice: it is doing too much, thus breaking the single responsibility principle. In addition, your configuration is a kind of deployment state for your app, one that usually mutates over time. If it doesn't, why was it a configuration in the first place?
As we know, containers thankfully drag us into immutable deployments, and a plain container is often ephemeral. That does not sit well with mutation of any kind. We end up with two results: too many configurations for what are supposed to be microservices, and a global shift in deployment architecture caused by containers.
We end up with a collision of methodologies, and you know what? We love this collision! It means we are in a great position to take a closer look at our configurations, and that we had better refactor them to reflect better software deployment processes, with or without containers. Containers are only the trigger for taking a step back and looking deeper into our configuration methodology, to make it cleaner and make our fellow developers' lives easier!
Configurations: The Process
Today, the process for having your app use a configuration usually involves the following three steps:
- Finding a home to manage and update the external configurations.
- Fetching the configurations from a remote location to your node.
- Processing the configurations by your app.
With a deployment configuration tool such as Puppet, your configuration can be part of your Puppet scripts and, as such, is planted on your nodes by the Puppet agent. Your app then reads it from a well-known location. Another similar but more dynamic option is to store the configuration in a database and read it onto the local node. The general idea is the same: reading configurations from a remote source into a local app.
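As a minimal sketch of that last step, reading a configuration file planted on the node might look like this (the path and the JSON format are assumptions for illustration):

```python
import json

# Hypothetical well-known location where Puppet (or a similar tool)
# plants the configuration file on the node.
CONFIG_PATH = "/etc/myapp/config.json"

def load_config(path=CONFIG_PATH):
    """Read the configuration file planted by the deployment tool."""
    with open(path) as f:
        return json.load(f)
```

The app itself stays oblivious to how the file got there; that is the deployment tool's job.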
Configurations: Moving to Containers
In the non-containerized world, you usually control explicitly the host where your application instance is installed; therefore, you are inclined to tell the configuration manager to plant the configuration file (yaml/properties/json/xml/…) on that host. Your application starts and assumes it can read the configuration file from a well-known location. In the containerized world, you have (as you should) much less control over the node on which the container resides.
This means you should not trust your file system to contain any externally planted configurations. In addition, you are going to have many containers of the same app type on the same node, which is usually not the case in a non-containerized deployment. Taking the above into account, you have to reconsider your configuration strategy and the planned path. The following paths are usually considered when redesigning the configuration methodology for containerized deployments (note that containers are evolving quickly, to the point where parts of this series of articles might be out of date by the time you read them).
The list below describes possible paths for configurations and containers that work together:
- Common configuration. You might have common configurations duplicated between multiple apps. Refactor to reuse it and thus clean up your conf.
- No configuration. If your app has no external configuration, you have no configuration to manage. You are good to go.
- Refactor to a microservice. If your application contains only a small set of configurations, it’s going to be convenient to pass them as arguments to your containers with minimal change to your app. You are good to go.
- Read configurations from a shared storage. If your configurations are stored in a centralized database or storage and you read them remotely as needed, you only need to bootstrap your system with the datastore connection parameters (which is itself a configuration), but you could do that with command line arguments to the container.
- New build for new conf. If you need to change a configuration, create a new Docker image build, which makes it a non-configuration. This option must be considered, as well.
- Take them from the command line. Even if you have a large set of configurations, you pass them as complex strings on the command line. For example, you pass a JSON value as an argument appended to the `docker run` command, and the container's entrypoint picks it up.
- Take them from the environment. Trust a third party to pre-set OS environment variables when your container starts. Tools such as Kubernetes ConfigMap allow you to do that; then your app internally reads those OS environment variables as its configuration. For more information, see the 12-factor app config section.
- Fetch from a Git repository. Fetch your configurations dynamically when your container starts from a remote Git repository. It’s a repeating theme, but the idea is that you can host these configurations somewhere remotely.
- Read from the node’s pre-planted file. Trust a third party to pre-plant a file on the node itself and then read the configurations from there.
- A combination of the above. Usually, a single solution will not suffice and you will need a combination of the above solutions.
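The command-line and environment paths above can share one parsing routine. A sketch, where the variable name `APP_CONFIG` is an assumption, set with something like `docker run -e APP_CONFIG='{"port": 8080}' …` or injected by an orchestrator:

```python
import json
import os

def config_from_env(var_name="APP_CONFIG"):
    """Parse a JSON configuration from an environment variable.

    The variable can be set via `docker run -e` or injected by an
    orchestrator (e.g. from a Kubernetes ConfigMap).
    """
    raw = os.environ.get(var_name)
    if raw is None:
        return {}  # no external configuration provided
    return json.loads(raw)
```

Keeping the parsing in one place means the app does not care whether Docker, Kubernetes, or a local shell set the variable.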
Don’t Forget the Developer’s Local Environment
When you choose a path, do not forget the developer’s local environment. Do they run Docker and an Orchestrator locally? If not, and you have chosen one of the paths above, you may have just made a developer’s life much more difficult! Either look for a seamless way for them to use the same Docker and Orchestrator locally or find an easy path for local environments, as well!
Many times, when we design the deployment of an application, we consider different environments and installations but forget a single environment: the development environment. In many cases, the way you run your software on a developer machine is different from the way you run it in production. Wasn't the whole purpose of Docker containers to ensure an application runs the same way in any environment? Doesn't that include the developer's local environment?
In general, the answer is yes, but do you really think a developer is going to build a new Docker image for every minor change in the application? Of course not. This means the developer runs the node server directly, that their configurations are different, that those configurations may be passed by the IDE, and that they may not be fetched from any external repository.
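One way to keep the local environment seamless is to use the same code path everywhere, with local defaults that apply only when the orchestrator has not set anything. A sketch, with hypothetical setting names and defaults:

```python
import os

# Hypothetical local defaults so a developer can run the app straight
# from the IDE, without Docker or an orchestrator.
DEV_DEFAULTS = {
    "db_url": "localhost:5432",
    "log_level": "DEBUG",
}

def get_setting(name):
    """Prefer the environment variable set by the orchestrator in QA or
    production; fall back to the local development default."""
    return os.environ.get(name.upper(), DEV_DEFAULTS.get(name))
```

In QA and production the orchestrator sets the variables and the defaults are never touched; on a developer machine nothing needs to be configured at all.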
Resiliency to Failures
Suppose you have happily implemented your chosen configuration-management path and stored all of your configurations in, say, a remote database (or you trust the orchestrator to pass them). But what if there is a problem with the remote database, or with whatever remote location the configuration resides in?
A good practice is to also store the configuration locally, using either environment variables or local files. That way, in the case of a network failure or a remote database outage, you still have your configurations. In most cases, if you followed environment-variable-based configuration, all should be well. The problem is that if your container restarts and tries to fetch a remote configuration at exactly that point, a remote database error may occur and the container would fail to start.
The mitigation in the old world was simple: any configuration could be cached locally in files, so you could restart your service even if your database was down (if you had built it to be resilient, which you should have). In today's environment, you dislike such locally mutated state, and thus you have a problem: companies rely more heavily on external system uptime. Unfortunately, this is not easy to solve in the containerized world without introducing new problems or resorting to shared storage.
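That old-world mitigation can be sketched as follows, with a hypothetical cache path and an injectable fetch function standing in for the remote store:

```python
import json
import os

def load_config_resilient(fetch, cache_path):
    """Try the remote configuration store first; on failure, fall back
    to the last locally cached copy so a restart can still succeed."""
    try:
        config = fetch()
    except (ConnectionError, OSError):
        # Remote store is down: serve the cached copy instead of failing.
        with open(cache_path) as f:
            return json.load(f)
    # Remote fetch succeeded: refresh the local cache for next time.
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    with open(cache_path, "w") as f:
        json.dump(config, f)
    return config
```

The catch, as described above, is exactly that `cache_path`: on an ephemeral container it may be empty on first start, which is why the cache wants to live somewhere the orchestrator manages.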
If you don’t want to rely on the node on which your container starts, it doesn’t make sense for every node to store a microservice’s configuration cache locally. Instead, the file system and the local caches turn into a cluster-wide cache, managed by the orchestrator and the deployer. As current container deployment methodology states: treat your data center as one big machine with many resources. In this case, you trust the orchestrator to at least plant the rightful defaults and configurations in the container’s filesystem and environment variables.
Handling configurations while adopting container technology is a challenge. You either need to strip microservices of their configurations or take a more network-based approach. Your data center is more like one big machine now, and that machine includes a network used to fetch configurations. This means you either take on more challenges to solve or mitigate them away entirely.
Whichever direction you choose, do not forget the developer’s local environment. Do not expect developers to have the same environment as QA or production: developers use an IDE, while production and QA do not. This requires special attention. Choose a path that makes your process as seamless as possible with a minimal set of changes. The developer’s local environment should be both supported and productive, without introducing new procedures and overhead.