Duplicate logs appearing when querying container logs in docker 1.7.1

We’re currently implementing docker support for our log monitoring service, and have encountered a problem where duplicate log entries are sent when ‘following’ stdout/stderr logs using the docker socket (or using the cli).

The issue exists in docker 1.7.1, and I would have logged a bug on Github, except it appears to be fixed in the latest stable version (1.8.3) - although I can’t find any tracked issue that addresses this specific problem there are quite a few that address ‘follow’ behaviour so I suspect it might have been fixed in the crossfire.

It also doesn’t help us, as we are not in a position to control the version of docker our clients are running and it’s currently not realistic to expect they will all be running the latest version.

That being the case, I was wondering if anyone has experienced something similar and knows of a possible workaround that doesn’t involve fragile compensation code in our our application that tries to remove duplicates manually.

I’ve included below a simple test program that illustrates the core of the issue, which seems to be, some output, followed by a pause, followed by some more output, which results in duplicate logs being sent over the docker API or cli.

Version info for docker being used for testing:

Client version: 1.7.1
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): df2f73d-dirty
OS/Arch (client): linux/amd64
Server version: 1.7.1
Server API version: 1.19
Go version (server): go1.4.2
Git commit (server): df2f73d-dirty
OS/Arch (server): linux/amd64

Steps to reproduce:

Save the following as stdout.py

import time
import sys

for i in range( 0, 5 ):
  print( "Printing line %d" % i )
  sys.stdout.flush()

print( "----" )
sys.stdout.flush()

time.sleep( 5 )

for i in range( 5, 10 ):
  print( "Printing line %d" % i )
  sys.stdout.flush()

Then get a python 2 container and run the script to demonstrate correct output:

docker pull python:2
docker run --rm --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py

This will show:

Printing line 0
Printing line 1
Printing line 2
Printing line 3
Printing line 4
----
Printing line 5
Printing line 6
Printing line 7
Printing line 8
Printing line 9

With a short pause in between the first 5 and the second 5 lines.

If I now run a new container and read the logs from the container while it is running, then the first 5 lines of output are duplicated:

docker run -d --name stdout -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python stdout.py && docker logs -t -f --tail=100 stdout

Which will produce output like:

2015-10-27T08:34:02.447894577Z Printing line 0
2015-10-27T08:34:02.447962742Z Printing line 1
2015-10-27T08:34:02.447971906Z Printing line 2
2015-10-27T08:34:02.447971990Z Printing line 3
2015-10-27T08:34:02.447972073Z Printing line 4
2015-10-27T08:34:02.448410164Z ----
2015-10-27T08:34:07.456426850Z Printing line 0
2015-10-27T08:34:07.456426850Z Printing line 1
2015-10-27T08:34:07.456426850Z Printing line 2
2015-10-27T08:34:07.456426850Z Printing line 3
2015-10-27T08:34:07.456426850Z Printing line 4
2015-10-27T08:34:07.456426850Z ----
2015-10-27T08:34:07.456426850Z Printing line 5
2015-10-27T08:34:07.456426850Z Printing line 6
2015-10-27T08:34:07.456426850Z Printing line 7
2015-10-27T08:34:07.456426850Z Printing line 8
2015-10-27T08:34:07.456426850Z Printing line 9

You can see that the first 6 lines (0-4 and ----) are duplicated, and the duplicates all have the same timestamp as the first line of new output (line 5).

If I check the actual container logs on disk there are no duplicates. Likewise, if I run the log command after the container has exited then there is no problem, and neither is there a problem if I don’t try to ‘follow’ the log.

We do however need to follow the logs while the container is running, so if anyone has experienced a similar thing and can help with some sort of workaround, it would be appreciated.