We’re currently implementing docker support for our log monitoring service, and have encountered a problem where duplicate log entries are sent when ‘following’ stdout/stderr logs using the docker socket (or using the cli).
The issue exists in docker 1.7.1, and I would have logged a bug on Github, except it appears to be fixed in the latest stable version (1.8.3) - although I can’t find any tracked issue that addresses this specific problem there are quite a few that address ‘follow’ behaviour so I suspect it might have been fixed in the crossfire.
It also doesn’t help us, as we are not in a position to control the version of docker our clients are running and it’s currently not realistic to expect they will all be running the latest version.
That being the case, I was wondering if anyone has experienced something similar and knows of a possible workaround that doesn’t involve fragile compensation code in our our application that tries to remove duplicates manually.
I’ve included below a simple test program that illustrates the core of the issue, which seems to be, some output, followed by a pause, followed by some more output, which results in duplicate logs being sent over the docker API or cli.
Version info for docker being used for testing:
Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): df2f73d-dirty
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): df2f73d-dirty
OS/Arch (server): linux/amd64
Steps to reproduce:
Save the following as stdout.py
import time
import sys
for i in range( 0, 5 ):
print( "Printing line %d" % i )
sys.stdout.flush()
print( "----" )
sys.stdout.flush()
time.sleep( 5 )
for i in range( 5, 10 ):
print( "Printing line %d" % i )
sys.stdout.flush()
Then get a python 2 container and run the script to demonstrate correct output:
docker pull python:2
docker run --rm --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py
This will show:
Printing line 0
Printing line 1
Printing line 2
Printing line 3
Printing line 4
----
Printing line 5
Printing line 6
Printing line 7
Printing line 8
Printing line 9
With a short pause in between the first 5 and the second 5 lines.
If I now run a new container and read the logs from the container while it is running, then the first 5 lines of output are duplicated:
docker run -d --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py && docker logs -t -f --tail=100 stdout
Which will produce output like:
2015-10-27T08:34:02.447894577Z Printing line 0
2015-10-27T08:34:02.447962742Z Printing line 1
2015-10-27T08:34:02.447971906Z Printing line 2
2015-10-27T08:34:02.447971990Z Printing line 3
2015-10-27T08:34:02.447972073Z Printing line 4
2015-10-27T08:34:02.448410164Z ----
2015-10-27T08:34:07.456426850Z Printing line 0
2015-10-27T08:34:07.456426850Z Printing line 1
2015-10-27T08:34:07.456426850Z Printing line 2
2015-10-27T08:34:07.456426850Z Printing line 3
2015-10-27T08:34:07.456426850Z Printing line 4
2015-10-27T08:34:07.456426850Z ----
2015-10-27T08:34:07.456426850Z Printing line 5
2015-10-27T08:34:07.456426850Z Printing line 6
2015-10-27T08:34:07.456426850Z Printing line 7
2015-10-27T08:34:07.456426850Z Printing line 8
2015-10-27T08:34:07.456426850Z Printing line 9
You can see that the first 6 lines (0-4 and ----) are duplicated, and the duplicates all have the same timestamp as the first line of new output (line 5).
If I check the actual container logs on disk there are no duplicates. Likewise, if I run the log command after the container has exited then there is no problem, and neither is there a problem if I don’t try to ‘follow’ the log.
We do however need to follow the logs while the container is running, so if anyone has experienced a similar thing and can help with some sort of workaround, it would be appreciated.