Log Aggregation for Docker Containers in a Mesos and Marathon Cluster
How to gather Docker container logs using mounted volumes, syslog, and the Docker API.
This article describes several alternatives for gathering Docker container logs in the distributed environment of an Apache Mesos and Mesosphere Marathon cluster: syslog, container linking, the Docker REST API, embedded logging that pipes stdout/stderr, and the Mesos APIs. We'll go through the problem statement, look at the different alternatives, and describe the challenges related to each of them.
By the way, the Docker REST API combined with intelligent "docker inspect" hooks does a great job.
Background
At elastic.io we have built an integration platform for developers, with the best possible environment to code, test, and run integration flows. An integration flow is a sequence of integration components connected to each other. Each integration component is an individual process running in a Docker container that communicates with the next component via a persistent RabbitMQ queue. We provide tooling and monitoring on top of that, and showing the logs of integration components is an important part of it.
A Logging Problem in Docker Containers
We have a large number of Docker containers and we need to aggregate their logs so that we can show them to our users. The containers run inside Apache Mesos and are scheduled with Mesosphere Marathon across a varying number of Mesos slaves. Our goal is to support all programming languages running inside the containers, so we can't really impose any specific logging framework. Therefore, our options are limited to grabbing stdout and stderr and pushing them to persistent storage, e.g. to S3. This is, by the way, not far from the 12-factor app logging concept, which actually supports our point here.
Another requirement is that we have to encrypt the log output. Security is a very important part of what we do, and log output may contain sensitive information. So we treat logs just like user data: logs have to be encrypted with a tenant-specific key.
Implementation Alternatives
After some googling, we identified the following alternatives:
- Alternative A: Mounted volume – store container logs on a mounted volume and pick them up from there.
- Alternative B: Syslog – aggregate logs within a container and push them somewhere over the network, e.g. via syslog.
- Alternative C: Docker API – use the Docker REST or CLI API and attach to each container after it starts.
Alternative A: Mounted Volume
It's a great solution: just mount /var/log to a volume outside the container and use other tools like Logstash to collect the logs from there. So, the main advantage is simplicity.
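For illustration only, a log-directory mount in a Marathon app descriptor could look roughly like this (the host and container paths are hypothetical; a collector such as Logstash would then watch the host directory):

{
  "container": {
    "type": "DOCKER",
    "volumes": [
      {
        "containerPath": "/var/log/app",
        "hostPath": "/data/container-logs",
        "mode": "RW"
      }
    ]
  }
}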
However, there are the following disadvantages:
- We don't know what applications will run inside our Docker containers, so we can't assume that logging will be written to the filesystem.
- Enforcing that logging is done to a file also goes against the 12-factor app logging concept.
- As we work inside a Mesos/Marathon cluster, we would have to make sure that log-collector agents are active on all Mesos slaves.
- Disk capacity: this would be partially solved by the Mesos sandbox, but becomes a problem again when mounting to an outside volume.
This solution would be ideal for us if we had packaged existing applications. But as that's not the case, we decided not to proceed with this one. However, if you like it and it fits your case, here is a nice blog post about it.
Alternative B: Syslog
We could aggregate logs within a container and then just push them to syslog. Syslog is a great Unix logging tool with a long history, so it is very stable and reliable.
This solution would involve connecting to and exposing a syslog port inside or outside of the container, and pushing data from the container into it. The best way to do that is by linking containers together (a minimal sketch of the in-container push follows at the end of this section). This solution has several advantages over the previous (filesystem mount) solution:
- No need for filesystem mounts; the network is simpler to use.
- Syslog gets the logging information pushed to it; it doesn't have to poll the filesystem.
- More flexible deployment options.
There are, however, also some drawbacks:
- The container needs to know about syslog and push the logs itself, so we're imposing some limitations on containers again.
- The container linking concept competes with some of the Mesos and Marathon concepts, so it doesn't seem right to use container linking without a proper network virtualization layer.
- To minimize network load, we would have to deploy a Docker container with syslog on all slaves.
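To make the "push from inside the container" idea concrete, here is a minimal sketch of an application sending a log line to a syslog endpoint over UDP. The host name, port, and tag are assumptions; a real setup would use a linked syslog container or an agent on the same slave:

var dgram = require('dgram');

// Hypothetical syslog target: in practice this would be a linked
// syslog container or an agent running on the same Mesos slave.
var SYSLOG_HOST = 'syslog';
var SYSLOG_PORT = 514;

function logToSyslog(message) {
  // <14> = facility "user", severity "info" (RFC 3164 style priority).
  var line = '<14>' + new Date().toISOString() + ' my-component: ' + message;
  var payload = Buffer.from(line);
  var socket = dgram.createSocket('udp4');
  socket.send(payload, 0, payload.length, SYSLOG_PORT, SYSLOG_HOST, function () {
    socket.close();
  });
}

logToSyslog('component started');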
Alternative C: Docker API
The Docker API solution involves the following steps:
- Monitor the start of new Docker containers on the Mesos slaves.
- As soon as a new container starts, attach to it and grab all of its stdout and stderr output.
- Encrypt the output and push it to the appropriate storage (e.g. S3).
The advantage of this solution is that no assumptions are made about the application inside a container. All logs sent to stdout and stderr will be forwarded.
The drawbacks, however, are the following:
- Network communication is required, so we have to work with distributed 'agents' on localhost to minimize it.
- The logging agents would need access to the Docker API, which represents a potential security issue.
- We would need a clever way to access the Docker API in a secure and reliable way.
- We would need to monitor the uptime of the logging agents to mitigate their failures.
More details about this solution are below.
The Solution: Say Hello to Boatswain
As you might have already guessed, the last solution is the one we implemented, and I have to say, so far it works like a charm. Our distributed Docker logging agent is called Boatswain, and it does a great job aggregating logging information from the Docker containers that run on our Mesos cluster. And guess what: it's less than 100 lines of code.
We use docker-allcontainers, which in turn uses dockerode and never-ending-stream to access the Docker API. Boatswain is notified by the local Docker daemon about starts and stops of all new and existing containers:
var allContainers = require('docker-allcontainers');

var ac = allContainers({
  preheat: true, // emit start events for all already running containers
  docker: null   // use the default dockerode connection
})
  .on('start', listeners.onContainerStart)
  .on('stop', listeners.onContainerStop);
You might be wondering how we connect to the Docker API; hang on a bit and you'll find the answer very soon.
When a new container starts, we do a quick docker inspect and then attach to it:
var Q = require('q');

function onContainerStart(meta, container) {
  console.log('Container started: %s / %s', meta.image, meta.name);
  Q.ninvoke(container, 'inspect')
    .then(attach)
    .catch(error)
    .done();
  ...
The resulting stream is encrypted and pushed to S3. That's it.
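The attach-encrypt-upload step itself isn't shown above; a rough sketch of it, using dockerode's attach, Node's crypto module, and the aws-sdk S3 client, might look like the following. The bucket name, key layout, tenant key handling, and function signature are all assumptions for illustration:

var stream = require('stream');
var crypto = require('crypto');
var AWS = require('aws-sdk');

var s3 = new AWS.S3();

// Hypothetical attach step: grab stdout/stderr from a running container,
// encrypt the stream with a tenant-specific key, and stream it to S3.
function attachAndShip(container, inspectData, tenantKey) {
  container.attach({ stream: true, stdout: true, stderr: true }, function (err, raw) {
    if (err) { return console.error(err); }

    // The attach stream is multiplexed; demux stdout and stderr into one stream.
    var logs = new stream.PassThrough();
    container.modem.demuxStream(raw, logs, logs);

    // AES-256-CBC with a tenant-specific key (IV handling is simplified here).
    var iv = crypto.randomBytes(16);
    var cipher = crypto.createCipheriv('aes-256-cbc', tenantKey, iv);

    s3.upload({
      Bucket: 'container-logs',                              // assumed bucket name
      Key: inspectData.Name + '/' + Date.now() + '.log.enc', // assumed key layout
      Body: logs.pipe(cipher)
    }, function (uploadErr, data) {
      if (uploadErr) { return console.error(uploadErr); }
      console.log('Uploaded encrypted logs to %s', data.Location);
    });
  });
}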
This app is packaged as a Docker container; we then use Apache Marathon to start and monitor it.
Obviously, somewhere along the way we ran into several other issues. Some of them were:
How to Connect to the Docker API?
As our logger process is deployed as a Marathon app, we needed a secure way to give it access to the Docker daemon running on each Mesos slave. Pavel and George, our developers, found a nice way to do that: they simply mounted the Docker socket inside the Docker container. Here is how it looks in our Marathon app descriptor file:
{
  "container": {
    "type": "DOCKER",
    "volumes": [
      {
        "containerPath": "/var/run/docker.sock",
        "hostPath": "/var/run/docker.sock",
        "mode": "RW"
      }
    ]
  }
}
As this container runs as root and the Mesos daemon also runs as root (it has to start Docker containers somehow, too), we get a nice socket-based solution that imposes no network load at all. In my opinion, it's the best way to use the Docker API from a Marathon app.
Note: make sure your Marathon version already supports volumes in the configuration.
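With the socket mounted, pointing the client at it is straightforward. A minimal sketch (dockerode defaults to /var/run/docker.sock anyway, so passing it explicitly is optional):

var Docker = require('dockerode');

// Connect to the local Docker daemon through the mounted Unix socket;
// no TCP port on the daemon needs to be exposed.
var docker = new Docker({ socketPath: '/var/run/docker.sock' });

docker.listContainers(function (err, containers) {
  if (err) { return console.error(err); }
  console.log('%d containers visible through the socket', containers.length);
});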
How to Deploy the Log Collector App?
So, how do we make sure that each Mesos slave has exactly one instance of Boatswain up and running? A good solution here is Marathon constraints, and this is our Marathon app descriptor:
{
  "id": "boatswain",
  "constraints": [
    [
      "hostname",
      "unique"
    ]
  ]
}
Now we just need to scale the app to the exact number of slaves, and Marathon will not only distribute Boatswain to all slaves, but will also make sure it is restarted in case of a shutdown.
There is, however, one little problem: when we increase the number of slaves, we need to update the Boatswain application descriptor. We could, of course, request a very large number of instances in the first place and rely on the unique-hostname constraint so that Marathon starts only one instance per slave. However, this would also leave the Boatswain deployment in a 'pending' state in the Marathon UI, which is not nice.
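For now, updating the instance count comes down to a single call against the Marathon REST API. A rough sketch (the Marathon host and port are assumptions):

var http = require('http');

// Hypothetical helper: scale the boatswain app to match the current
// number of slaves. The Marathon host and port are assumptions.
function scaleBoatswain(slaveCount) {
  var body = JSON.stringify({ instances: slaveCount });
  var req = http.request({
    host: 'marathon.internal',
    port: 8080,
    path: '/v2/apps/boatswain',
    method: 'PUT',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body)
    }
  }, function (res) {
    console.log('Marathon responded with HTTP %d', res.statusCode);
  });
  req.on('error', console.error);
  req.end(body);
}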
We still need to figure out the best solution here.
How to Know Which App Is Running in Which Container?
As we make no assumptions about the code running inside the containers, we have to find a reliable way to identify each container and associate it with a particular integration component of a particular tenant running on our system. This is quite a significant issue. What we get from the Docker API is a container ID, which is essentially a randomly generated UUID. Neither Marathon nor Mesos gives us a reliable way to transport any kind of identifier down to the Docker container (e.g. by naming the container something like marathon-app-slaveid-random).
The solution we found for this issue is to inspect the container beforehand. When launching an integration component on Marathon, we give it a couple of environment variables so that, for example, it can connect to RabbitMQ and decrypt messages from there. With docker inspect, we gain access to those environment variables, so we can reliably identify the app inside the Docker container and encrypt its log files with a tenant-specific key.
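A minimal sketch of that identification step; the environment variable names below are made up for illustration:

// Hypothetical: extract identifying environment variables from the
// `docker inspect` result (the variable names are illustrative only).
function identifyComponent(inspectData) {
  var env = {};
  (inspectData.Config.Env || []).forEach(function (pair) {
    var idx = pair.indexOf('=');
    env[pair.slice(0, idx)] = pair.slice(idx + 1);
  });
  return {
    tenantId: env.TENANT_ID, // assumed variable name
    flowId: env.FLOW_ID      // assumed variable name
  };
}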
Conclusion
We are quite happy with the resulting approach. It's not only simple (less than 100 lines of code), but also a clever solution that uses the technologies at hand and imposes no requirements on the applications running in Docker containers on top of Mesosphere Marathon and Apache Mesos.
Published at DZone with permission of Renat Zubairov, DZone MVB.