We are revolutionizing our integration platform-as-a-service with the help of Docker containers, and we’ve been looking for an ideal solution for gathering Docker container logs in the distributed environment of an Apache Mesos / Marathon cluster. We’ve tried different alternatives, including Syslog, container linking, the Docker REST API, embedded logging by piping stdout/stderr, and the Mesos APIs.
The solution we liked best is probably not a universal one :-) but it certainly worked very well for us, so maybe you’ll find it interesting too. We actually wrote a whole blog article about it, but I’ll post only the first part here in order NOT to clutter the Internet :-)
Background

At elastic.io we are building an integration platform for developers, with the best possible environment to code, test and run integration jobs or flows. An integration flow is a sequence of integration components that are connected to each other. Each integration component is an individual process running in a **Docker** container that communicates with the next component via a persistent **RabbitMQ** queue. We provide tooling and monitoring on top of that, and BTW showing the logs of integration components is an important part of it.
Logging problem

We have a large number of **Docker** containers and we need to aggregate logs from them so that we can show them to the users. The Docker containers run inside **Apache Mesos** and are scheduled with **Mesosphere Marathon** on a varying number of Mesos slaves. Our goal is to support all programming languages (running inside **Docker** containers), so we can't really impose any specific logging framework. Our options are therefore limited to grabbing *STDOUT* and *STDERR* and pushing them to persistent storage, e.g. S3. By the way, this is not too far from the [12 Factor Apps logging concept], which supports our point here.
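To make "grabbing *STDOUT* and *STDERR*" a bit more concrete, here is a minimal Python sketch of the idea: start a process, read both streams line by line, and hand every line to a sink together with the name of the stream it came from. This is an illustration only, not the implementation described in this post. `capture_streams` and the `sink` callback are hypothetical names, and a real collector would forward the lines to persistent storage instead of a Python list.

```python
import subprocess
import sys
import threading
from typing import Callable, List


def capture_streams(cmd: List[str], sink: Callable[[str, str], None]) -> int:
    """Run cmd and pass every output line to sink(stream_name, line).

    Returns the exit code of the process. Hypothetical helper for illustration.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )

    def pump(stream, name: str) -> None:
        # Read until the process closes this stream.
        for line in stream:
            sink(name, line.rstrip("\n"))

    # One thread per stream so that neither pipe can fill up and block the process.
    threads = [
        threading.Thread(target=pump, args=(proc.stdout, "stdout")),
        threading.Thread(target=pump, args=(proc.stderr, "stderr")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return proc.wait()


if __name__ == "__main__":
    # The child process here is just another Python interpreter, as a stand-in
    # for whatever language actually runs inside the container.
    child = [sys.executable, "-c",
             "import sys; print('hello'); print('oops', file=sys.stderr)"]
    capture_streams(child, lambda name, line: print(f"[{name}] {line}"))
```

The relative order of stdout and stderr lines is not guaranteed, because the two streams are read independently.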
Another requirement for the solution is that we have to encrypt the log output. Security is a very important part of what we do, and [log output may contain sensitive information], so we treat logs just like user data: logs have to be encrypted with a tenant-specific key.
Implementation alternatives

After some googling we identified the following alternatives:
- Mounted volume: Store container logs on a mounted volume and pick them up from there
- Syslog: Aggregate logs within a container and push them somewhere over the network, e.g. via syslog
- Docker API: Use the Docker REST API or CLI and attach to each container after start
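For the Docker API alternative, it helps to know that when a container runs without a TTY, the Docker Engine API multiplexes stdout and stderr into one stream on its attach and logs endpoints. Each frame starts with an 8-byte header: one byte for the stream type (0 = stdin, 1 = stdout, 2 = stderr), three zero bytes, and a 4-byte big-endian payload length. The sketch below splits such a byte stream back into labelled chunks. `demux_docker_stream` and `frame` are illustrative names, and the sketch assumes the bytes have already been read from the API.

```python
import struct
from typing import Iterator, Tuple

# Stream type byte used in the frame header of a multiplexed Docker stream.
STREAM_NAMES = {0: "stdin", 1: "stdout", 2: "stderr"}

# 1 byte stream type, 3 padding bytes, 4-byte big-endian payload size.
HEADER = ">BxxxI"
HEADER_SIZE = struct.calcsize(HEADER)  # 8 bytes


def demux_docker_stream(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (stream_name, payload) for every complete frame in data."""
    offset = 0
    while offset + HEADER_SIZE <= len(data):
        stream_type, size = struct.unpack_from(HEADER, data, offset)
        start = offset + HEADER_SIZE
        payload = data[start:start + size]
        if len(payload) < size:
            break  # incomplete frame: wait for more bytes before decoding
        offset = start + size
        yield STREAM_NAMES.get(stream_type, "unknown"), payload


def frame(stream_type: int, payload: bytes) -> bytes:
    """Build one frame; only used here to create sample input."""
    return struct.pack(HEADER, stream_type, len(payload)) + payload


if __name__ == "__main__":
    raw = frame(1, b"started\n") + frame(2, b"boom\n")
    for name, payload in demux_docker_stream(raw):
        print(name, payload)
```

Containers started with a TTY do not use this framing; their output arrives as a single raw stream, so stdout and stderr can no longer be told apart.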
Alternative A: Mounted volume

It's a great solution if we package existing applications: just mount */var/logs* outside of the container and use other tools like **logstash** to collect the logs. So the first advantage is **simplicity**.
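As a rough illustration of the collector side of this alternative (the job a tool like logstash does for real), the following Python sketch reads only the lines that were appended to the files in a mounted log directory since the last pass. `collect_new_lines` and the `offsets` dictionary are hypothetical names; the sketch deliberately ignores log rotation and truncation, which a real collector has to handle.

```python
import os
from typing import Dict, List, Tuple


def collect_new_lines(log_dir: str, offsets: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (file_name, line) for every complete line added since the last call.

    offsets maps a file path to the number of bytes already consumed and is
    updated in place, so the caller keeps it between passes.
    """
    collected: List[Tuple[str, str]] = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        consumed = offsets.get(path, 0)
        with open(path, "rb") as fh:
            fh.seek(consumed)
            chunk = fh.read()
        # Only consume up to the last newline; a partially written line
        # is picked up on a later pass once it is complete.
        end = chunk.rfind(b"\n") + 1
        for line in chunk[:end].splitlines():
            collected.append((name, line.decode("utf-8", errors="replace")))
        offsets[path] = consumed + end
    return collected


if __name__ == "__main__":
    # /var/logs is the mount point mentioned above; adjust as needed.
    seen: Dict[str, int] = {}
    for file_name, line in collect_new_lines("/var/logs", seen):
        print(f"{file_name}: {line}")
```

Calling the function periodically with the same `offsets` dictionary gives a very simple "tail" over the whole directory.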
However, there are the following disadvantages:
- We don’t know which applications will run inside our Docker containers, so we can't assume that they will log to the filesystem
- Enforcing that logging is done to a file also goes against the [12 Factor Apps logging concept]
- As we are working inside a Mesos/Marathon cluster, we would have to make sure log-collector agents are active on all Mesos slaves
- Disk capacity: partially solved via the Mesos sandbox, but it becomes a problem again when mounting an outside volume
We decided not to proceed with this one. If you like it, here is a [nice blog post] about it.