We’re currently implementing docker support for our log monitoring service, and have encountered a problem where duplicate log entries are sent when ‘following’ stdout/stderr logs using the docker socket (or using the cli).
The issue exists in docker 1.7.1, and I would have logged a bug on Github, except it appears to be fixed in the latest stable version (1.8.3) - although I can’t find any tracked issue that addresses this specific problem there are quite a few that address ‘follow’ behaviour so I suspect it might have been fixed in the crossfire.
It also doesn’t help us, as we are not in a position to control the version of docker our clients are running and it’s currently not realistic to expect they will all be running the latest version.
That being the case, I was wondering if anyone has experienced something similar and knows of a possible workaround that doesn’t involve fragile compensation code in our our application that tries to remove duplicates manually.
I’ve included below a simple test program that illustrates the core of the issue, which seems to be, some output, followed by a pause, followed by some more output, which results in duplicate logs being sent over the docker API or cli.
Version info for docker being used for testing:
Client version: 1.7.1 Client API version: 1.19 Go version (client): go1.4.2 Git commit (client): df2f73d-dirty OS/Arch (client): linux/amd64 Server version: 1.7.1 Server API version: 1.19 Go version (server): go1.4.2 Git commit (server): df2f73d-dirty OS/Arch (server): linux/amd64
Steps to reproduce:
Save the following as stdout.py
import time import sys for i in range( 0, 5 ): print( "Printing line %d" % i ) sys.stdout.flush() print( "----" ) sys.stdout.flush() time.sleep( 5 ) for i in range( 5, 10 ): print( "Printing line %d" % i ) sys.stdout.flush()
Then get a python 2 container and run the script to demonstrate correct output:
docker pull python:2 docker run --rm --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py
This will show:
Printing line 0 Printing line 1 Printing line 2 Printing line 3 Printing line 4 ---- Printing line 5 Printing line 6 Printing line 7 Printing line 8 Printing line 9
With a short pause in between the first 5 and the second 5 lines.
If I now run a new container and read the logs from the container while it is running, then the first 5 lines of output are duplicated:
docker run -d --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py && docker logs -t -f --tail=100 stdout
Which will produce output like:
2015-10-27T08:34:02.447894577Z Printing line 0 2015-10-27T08:34:02.447962742Z Printing line 1 2015-10-27T08:34:02.447971906Z Printing line 2 2015-10-27T08:34:02.447971990Z Printing line 3 2015-10-27T08:34:02.447972073Z Printing line 4 2015-10-27T08:34:02.448410164Z ---- 2015-10-27T08:34:07.456426850Z Printing line 0 2015-10-27T08:34:07.456426850Z Printing line 1 2015-10-27T08:34:07.456426850Z Printing line 2 2015-10-27T08:34:07.456426850Z Printing line 3 2015-10-27T08:34:07.456426850Z Printing line 4 2015-10-27T08:34:07.456426850Z ---- 2015-10-27T08:34:07.456426850Z Printing line 5 2015-10-27T08:34:07.456426850Z Printing line 6 2015-10-27T08:34:07.456426850Z Printing line 7 2015-10-27T08:34:07.456426850Z Printing line 8 2015-10-27T08:34:07.456426850Z Printing line 9
You can see that the first 6 lines (0-4 and ----) are duplicated, and the duplicates all have the same timestamp as the first line of new output (line 5).
If I check the actual container logs on disk there are no duplicates. Likewise, if I run the log command after the container has exited then there is no problem, and neither is there a problem if I don’t try to ‘follow’ the log.
We do however need to follow the logs while the container is running, so if anyone has experienced a similar thing and can help with some sort of workaround, it would be appreciated.