Docker has been adopted quickly by nearly everyone and incorporated into everything from cloud technologies, to continuous integration and build systems, to solo developers working exclusively on their laptops. Heck, even Microsoft is getting in on this! Docker was born in PaaS (dotCloud), and that is where it makes a lot of sense: ephemeral, fast-starting, single-process containers that can be distributed across a large cluster are exactly where Docker shines.
Docker has been Stackato's container implementation for a year now, responsible for provisioning and managing the life-cycle of who knows how many Linux containers. The next question is how we start exposing Docker features to end users, rather than keeping them as an internal implementation detail. These features bring portability with a simple packaging mechanism for building and distributing an application in a consistent way, not only across a specific PaaS, but anywhere that Docker runs.
Docker seems like the obvious choice for PaaS. Engineers building PaaS solutions are excited by it and many developers are banging down the door demanding it. There is no doubt every PaaS worth its salt will, at some point in the near future, implement mechanisms for developers to drop in their pre-built Docker images - or at least a Dockerfile.
But let's take a step back from this euphoria for a minute and look at the bigger picture. Is this really the right abstraction for developers? Docker brings a lot to the table, but as with everything there are pros and cons. So what do we lose here?
There are open-source Buildpacks for most programming languages (even COBOL!). These have been built and evolved by experts in those languages. The Buildpacks deal with the low-level system dependencies and everything about configuring the runtime on which these applications run. Low-level system dependencies should never be the concern of the software developer. Cloud Foundry-based PaaSes, like Stackato, also remove the need for developers to know or care about what a Buildpack actually is. The administrator of the cluster can install all the commonly used language and framework stacks via Buildpacks, and the PaaS will select the best Buildpack for the job - this selection is a mechanism of the Buildpacks themselves, not the PaaS.
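That selection step can be sketched with the buildpack "detect" contract: the platform runs each installed buildpack's detect step against the application directory, and the first one to claim the app is chosen to build it. Real Buildpacks ship this as an executable `bin/detect` script; the function below is a simplified stand-in, and the `Gemfile` check is just one example of how a Ruby buildpack might recognize its stack.

```shell
# Minimal sketch of a buildpack detect step (a function here for illustration;
# real buildpacks implement this as an executable bin/detect script).
detect() {
  build_dir="$1"
  if [ -f "$build_dir/Gemfile" ]; then
    echo "Ruby"   # the name the platform reports for the chosen buildpack
    return 0      # claim the app
  fi
  return 1        # decline; the platform tries the next installed buildpack
}
```

A Node.js buildpack would look for `package.json` in the same way. The point is that the developer never sees any of this - they just push code.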
With Docker, in regard to PaaS support, we are expecting developers to bring their own pre-built Docker images. This unfortunately means that we are going backwards and now telling developers that they must create their stack themselves, outside of the platform. The low-level system dependencies within the container are once again in the domain of the developer. Time spent figuring this stuff out is redundant, is not the best use of an engineer's expertise, and is therefore prone to error.
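For a concrete sense of what lands back in the developer's lap, here is a hypothetical Dockerfile for a small Ruby web app. The base image, packages, and ports are all invented for illustration, but every line below is stack plumbing that a Buildpack would otherwise handle:

```dockerfile
# Hypothetical example - all of this is now the developer's responsibility.
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y ruby ruby-dev build-essential
RUN gem install bundler
COPY . /app
WORKDIR /app
RUN bundle install
EXPOSE 8080
CMD ["bundle", "exec", "rackup", "-p", "8080"]
```

Getting this right - and keeping it patched - is exactly the work the platform used to do for you.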
There are mechanisms and evolving best practices for building Docker images that provide certain software stacks, removing the onus on developers to understand the low-level details. Some of these solutions are close to, or leverage, Buildpacks themselves. This is the right direction, but it still requires systems outside the consistent, managed platform of the PaaS, and that leads to potential fragmentation across a large organization.
The second issue with Docker images being built outside of the PaaS should concern Operations teams the most. Vulnerabilities are a growing fact of life in managing deployed software. We have seen many this year. When a vulnerability is announced, Operations needs to know two things: 1) are we affected? 2) if so, how do we patch our systems quickly?
One direction that Buildpack development is heading is metadata accountability. Currently, a Buildpack will look at the application code, decide which dependencies need to be installed, and install them. After that, knowledge of what is installed in that container is lost. Retaining this information will be very powerful moving forward. For instance, as a PaaS administrator I should be able to query the system to find out exactly which applications are running a specific version of the Ruby or Java runtime binaries. Having this information at your fingertips the second that a vulnerability is announced will be incredibly powerful.
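As a sketch of what that query could look like, suppose each staged application kept a flat manifest of the runtimes its Buildpack installed. Nothing here is an existing PaaS feature - the manifest layout, application names, and version numbers are all invented for illustration - but even this naive form answers the administrator's question:

```shell
# Hypothetical per-app manifests recording what each Buildpack installed.
MANIFEST_DIR="$(mktemp -d)"

cat > "$MANIFEST_DIR/billing-api" <<'EOF'
ruby 2.1.2
bundler 1.6.2
EOF

cat > "$MANIFEST_DIR/web-frontend" <<'EOF'
node 0.10.29
EOF

# "Which applications are running Ruby 2.1.2?" - list manifests containing it.
grep -l '^ruby 2.1.2$' "$MANIFEST_DIR"/*
```

A real implementation would live in the platform's data store rather than flat files, but the shape of the query is the same: dependency metadata in, affected application list out.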
When we package our Docker images outside of the system and pass them into the PaaS, we're essentially giving the Operations team a black box, and there is no easy way for them to determine everything that may be installed in that black box. This is a big problem for a large-scale organization when a serious vulnerability is announced. These are concerns that ActiveState is thinking about, and we currently use Buildpacks to solve this issue. Unfortunately, we see few solutions for Docker images yet.
I see most PaaSes supporting both Docker images and Buildpacks for the near future. The demand for exposed Docker integration, and the flexibility that Docker provides, makes this a no-brainer. But Buildpacks still offer a great deal of value, and we are seeing enterprise-grade features such as accountability being built into them. It will be up to each enterprise to decide which solution works best for them.