“Will the future be Dockerized?” — A Discussion on Docker, Containers, and the Future of Application Delivery
At XebiaLabs, we know that building great tools is only part of the story. What our users are especially looking for is thought leadership and trusted advice on Continuous Delivery and DevOps automation.
For us, staying ahead of the game in terms of new developments and technologies is a top priority, and one in which we invest heavily. We're constantly in touch with experts and thought leaders in the application delivery space, discussing future delivery models and practices.
Here, I'd like to share some excerpts from a current thread related to Docker, container management tools in general, and the "container-model" application delivery style they support.
My short answer to the title question, by the way? For some types of applications and architectures, the technology is a very compelling option. But we have some significant challenges still to address, including:
- What is the right interface to the development organization? The Dockerfile, i.e. the full container definition? Some abstraction above it?
- How will we manage ownership and control of the various layers in a container?
- Once we figure that one out, how do we manage the combinatorial explosion associated with having different layers?
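To make the first question above concrete: the Dockerfile is the full container definition, and each instruction in it produces a new file-system layer. A minimal, illustrative sketch (the image, package and file names are assumptions, not from the discussion):

```dockerfile
# Base layer: whoever provides this image controls everything below it
FROM ubuntu:14.04

# "Platform" layer: a runtime installed on top of the base
RUN apt-get update && apt-get install -y openjdk-7-jre-headless

# "Application" layer: the part the development team actually owns
COPY app.jar /opt/app/app.jar
EXPOSE 8080
CMD ["java", "-jar", "/opt/app/app.jar"]
```

Each instruction is a distinct layer, which is exactly where the ownership and combinatorial questions arise: who controls the FROM line, and how many base/runtime combinations does the organization end up maintaining?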
Uday: "Docker promotes ephemeral deployments: containers are really cheap to create and destroy (more efficient than VMs), and they fit quite well in the cloud, where applications are built with the assumption that they are bound to fail and should be able to handle those failure scenarios. What does that mean for developers/Ops?
Developers - Build fast, cheap and easy containerized environments to develop against, without worrying about provisioning, and build applications with failure in mind.
Ops - Upgrades, new features, new service deployments etc. are now containerized, so deployments are simply Dockerized images with the latest features/code. This is a huge change compared to deploying and managing individual applications. Docker kind of abstracts the applications away, bringing some consistency to managing various applications, however different they might be.
What do you think? Do you think the future will be Dockerized?"
I certainly don't think all of the future will be containerized (discussions seldom seem to touch on persistent data, for example).
Even though Docker is getting all the attention at present, this "virtual appliance" style of deployment isn't new: tech companies have been doing much the same with Amazon EC2 AMIs for a while, Twitter does something similar using Mesos, and Vagrant has for years supported "whole environment" delivery, first for VirtualBox and now for more providers, even if in practice it's mainly used for local environments. Arguably, this is also the deployment model mainframes use.
From what I can see, the role this model will play will depend largely on two factors: future application architecture and future division of responsibility for what's in a container.
For those future applications where you're essentially looking at a lightweight web process, lightweight containers make a lot of sense - more so than full VMs, which are still pretty "fat" for something that's not much more than one OS process with a port binding. The current generation of microservices architecture sketches is composed largely of such processes, but I will be curious to see what happens once we get on to talking about centralized lookup directories, version compatibility, format translation and so on, i.e. the kind of discussion we recall from service buses. Even today, there are application types for which lightweight processes are not ideal, e.g. data processing clusters.
Perhaps the more interesting challenge from an enterprise perspective is the question of ownership: when you create a container, you're effectively trusting the provider of that image to have "done things right": configured security correctly etc. The reason I call this delivery style the "virtual appliance" model is that it's not much different from having a physical appliance delivered and installing it in a rack: you have to trust the vendor to have done things properly.
Is that how Ops will relate to development deliverables - from outsourced providers, potentially - in future? It's certainly going to be a challenge, and I'm sure we'll see more work along the lines of "locking" the base image, preventing the "development part" of the image from running certain commands, etc.
Or will the pendulum swing back again towards PaaS, where the boundary between what the developer delivers and where the platform starts is very explicit, and typically at a higher level and narrower scope than a whole container? From the discussions we've had with larger organizations, this distribution of control and responsibility is a more natural fit for the way enterprises are currently structured, the current DevOps buzz notwithstanding.
So perhaps the most natural short-to-medium term future is using container deployments under the protective cover of a higher-level abstraction such as that offered by a PaaS. Unsurprisingly, that's also where a lot of Docker adoption is happening: as an implementation component for platforms like OpenShift and Cloud Foundry.
Indeed, that (i.e. dotCloud) is also where Docker came from.
Managed VMs in Google App Engine will be an interesting one to watch here, as an experiment in removing that cover.
Uday: "Obviously this isn't a mature technology yet. I'd imagine patterns (data volumes, containers acting only as data volumes) will emerge as containers go mainstream."
Very interesting idea, especially with storage technologies that are essentially file-system based, e.g. HDFS. I wonder what we'll see happening in this space.
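As a sketch of the pattern Uday mentions, as it was commonly described at the time: a "data-only" container exists purely to own a volume, which application containers then mount. The image and path below are illustrative assumptions:

```dockerfile
# Hypothetical data-only image: its sole job is to declare a volume
FROM busybox
# Data written here lives outside the container's own layered file system
VOLUME /var/lib/appdata
# The container doesn't need to keep running for the volume to be usable
CMD ["/bin/true"]
```

An application container can then attach the volume with `docker run --volumes-from <data-container>`, so the application container itself stays disposable while the data survives restarts and redeployments.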
Uday: "[Speaking of the trust question,] I'd imagine trusted vendors like Red Hat / Ubuntu (or perhaps someone new) will leverage their certified partner ecosystem to provide base images."
I think we're likely to see some kind of "layering" model here that will allow customizations on top of these base images in a controlled way. To me, one of the most interesting features of Docker in this regard is precisely its file system layering approach. At present, it doesn't offer much fine-grained control, but I can easily imagine features like the following:
- base images can mark certain parts of the file system as "off limits", i.e. not modifiable by overlays
- the "platform team" within an organization can mark certain parts of images as "off limits", and/or apply access control as to who can modify them with additional layers
- the platform team can enforce that certain layers are always mixed in, e.g. a layer that patches a vulnerable lib
- some kind of validation rules can be applied to the overall container definition (images and commands) to ensure everything conforms to certain rules
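None of these controls exist in Docker today, but to illustrate what such a layering model might look like, here is a speculative Dockerfile. The registry, image names, patch command and run script are all hypothetical:

```dockerfile
# Certified vendor base image; hypothetically, parts of its file system
# would be marked "off limits" to the layers above
FROM registry.example.com/certified/base:1.0

# Platform-team layer: imagine this being mixed in automatically,
# e.g. to patch a vulnerable SSL library
RUN yum update -y openssl

# Development-team layer: the only part the application team may modify
COPY app.war /opt/app/
CMD ["/opt/platform/run.sh", "/opt/app/app.war"]
```

Validation rules of the kind listed above would then run over this definition as a whole, checking the FROM line, the mandatory platform layer and the commands the development layer is allowed to use.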
How/whether this can work with operating systems that do not do everything via the file system (a popular one out of Redmond comes to mind) remains a fascinating question.
Uday: "I look at containers as shipping boxes. When I order something online, the retailer does everything necessary to package my order safely and securely and the shipping company simply takes the item and transports it to its destination. There are a lot of intermediate hops before it reaches the destination but it's still the same box, no changes are made to the box.
A VM and a container are just different types of boxes, and it's up to the shipper which boxes they want to use. Boxes, though, do not change how you build your product. Yes, if you want to reduce your shipping costs, you'll try to build something that fits efficiently in a particular box, but only if shipping is a major cost."
Interesting analogy. I certainly agree that the box (for suitably flexible box types) should not impact the content or product, as all the vendors who offer a Vagrant base box, a supported AMI, a recommended Docker image etc. for the same product demonstrate. Containers as a shipping/deployment model are indeed a compelling story.
To continue the analogy a step further: the challenge I see here for large organizations is that you need to keep the boxes around. Unlike in the online retailer example, where you typically discard the box once you have taken your product out (and so it indeed doesn't matter which box was used), in this case, you basically need to keep and maintain the box somehow.
In other words, an enterprise functions almost as a large "container storage and processing facility", and in such a scenario you would most likely want to avoid:
- having to support too many different container types, or
- containers with known or potentially dangerous content.