`docker pull` can't resume after being killed on spotty internet

Expected behavior

docker pull can be Ctrl-C'd and executed again, and the Docker daemon will continue where it left off.

Actual behavior

I'm on Gogo in-flight Wi-Fi right now (it's slow and spotty). I try to pull an image; it gets some percentage of the way done and then just stops. If I kill the docker pull invocation and run it again, it picks up where it left off, but it never proceeds further than where I killed it. I have to actually restart Docker for Mac (i.e. the VM) before it will pull anything. Of course, that restarts the layer from scratch, so I'm basically unable to ever get the layer (until I land, at least).

Information

  • the output of:
    • Moby Menu > Diagnose & Feedback on OSX
Docker for Mac: version: mac-v1.11.2-beta15
OS X: version 10.11.5 (build: 15F34)
logs: /tmp/20160618-190819.tar.gz
failure: No error was detected
[OK]     docker-cli
[OK]     app
[OK]     menubar
[OK]     virtualization
[OK]     system
[OK]     osxfs
[OK]     db
[OK]     slirp
[OK]     moby-console
[OK]     logs
[OK]     vmnetd
[OK]     env
[OK]     moby
[OK]     driver.amd64-linux
  • a reproducible case if this is a bug, Dockerfiles FTW
    n/a
  • page URL if this is a docs issue or the name of a man page
    n/a
  • host distribution and version ( OSX 10.10.x, OSX 10.11.x, Windows, etc )
    OS X 10.11.5 (15F34)

Steps to reproduce the behavior

  1. Start a docker image pull with fairly slow/spotty internet.
  2. Wait until the download stalls and stops (even though other Internet works).
  3. Ctrl-C the docker pull.
  4. Start the docker pull again.
  5. Notice that it never proceeds past where it was when you killed it.

Meta

It'd be nice if I could open the forums without clobbering my clipboard (re: the "Copy diag id to clipboard and open forum" button). Also, this template doesn't even ask for the Diagnostics ID.

And my Diagnostic ID is 8337D04F-39A8-42F9-83C3-994544794087.


Me too. This is the output I get when I retry a pull of an image whose download was earlier interrupted by a bad internet connection:

docker@default:~$ docker pull jupyter/minimal-notebook
Using default tag: latest
latest: Pulling from jupyter/minimal-notebook
8b87079b7a06: Already exists
a3ed95caeb02: Already exists
b7837d46e7ab: Already exists
662ca9fe7b05: Already exists
97953f27b77a: Already exists
a1d35087ecd0: Already exists
c4db12d2b839: Already exists
b322dbef1dd0: Already exists
67eb38a48022: Already exists
e0ffb6fc1d47: Already exists
9475bfdcc016: Already exists
77e33b354adc: Already exists
ab0a9f379c27: Already exists
3b283206c60e: Already exists
df886c3898cb: Downloading [=================================>                 ]   412 MB/615.1 MB
docker@default:~$ 

The interrupted download never seems to get past the 412 MB mark.
My environment:
docker-toolbox: 1.10.3
host: Windows 7

I think I'm going to stop using Docker because of this.
Having to re-download a 2 GB Docker image every time makes no sense at all.

+1 from me, but I'm actually running Docker for Windows inside VMware Fusion on a Mac. I don't know if that's what's causing the spotty connection (the same thing happens whether I use bridged or NAT networking) or whether my host machine would have the same problem. Since I'm trying to download a Windows container, I can't docker pull the image from the Mac side, so for now I've spun up an AWS EC2 instance for the sole purpose of pulling this image, doing a docker save to a tar file, and then transferring it to my VM through other means (see the commands below).
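
For anyone replicating that workaround, the save/load round-trip is just the following (the image name here is a placeholder, not the actual image I'm pulling):

# on the machine with a reliable connection (the EC2 instance):
docker pull example/windows-image
docker save -o windows-image.tar example/windows-image

# transfer the tar file by any resumable means (scp, rsync, etc.),
# then on the target machine:
docker load -i windows-image.tar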

Just to be clear, the behavior we are seeing is that fully downloaded fs layers are not re-downloaded, but if a layer stalls and is cancelled, it ALWAYS restarts from the beginning instead of resuming where it left off. I can usually download between 20 and 100 MB at a time successfully, but this Windows layer is 4 GB, so I never get anywhere close.

Docker's released code lags behind the moby development repository on GitHub, and people have been reporting issues related to this for several years. I tried manually applying several patches that aren't upstream yet, and none of them worked well.

The GitHub repository for moby (Docker's development repo) has a script called download-frozen-image-v2.sh. It's a bash script that uses curl and a command-line JSON parser; it retrieves a Docker registry token and then downloads all of an image's layers to a local directory, which you can then import into your local Docker installation with docker load.
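
For reference, basic usage of the stock script looks roughly like this (the image name is just an example):

# download all layers of an image into a local directory
./download-frozen-image-v2.sh ./minimal-notebook jupyter/minimal-notebook:latest

# pack the directory into a tarball and pipe it into docker load
tar -cC ./minimal-notebook . | docker load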

It does not handle resuming well, though. The script contains a comment noting that curl -C isn't working. I tracked down and fixed this problem: my modification first fetches the response headers into a ".headers" file (in my monitoring, that initial request has always returned a 302 redirect), then downloads the final URL with curl, with resume support, into the layer tar file. The calling function also has to loop to keep fetching a valid token, which unfortunately only lasts about 30 minutes.

It loops this process until it receives a 416, which means no resume is possible because the requested range has already been fulfilled. It also verifies the file size against a curl header retrieval. I have been able to retrieve every image I needed using this modified script. Docker itself has many more layers of retrieval logic, plus remote-control processes (the Docker client) that make it harder to control, and the maintainers viewed this issue as only affecting some people on bad connections.

I hope this script can help you as much as it has helped me:

Changes: the fetch_blob function uses a temporary file for its first connection and reads the 30x HTTP redirect from it. It then attempts a header retrieval on the final URL and checks whether the local copy already has the full file; otherwise, it begins a resumed curl operation. The calling function wraps token retrieval and fetch_blob in a loop that ensures the full file is obtained.

The only other change is a bandwidth-limit variable, which can be set at the top of the script or via a "BW:10" command-line parameter. I needed this to keep my connection usable for other things.

Go here: https://pastebin.com/jWNbhUBd for the modified script.
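
In outline, the modified fetch_blob works something like the sketch below; the function and variable names here are illustrative, not the exact ones from the pastebin:

fetch_blob() {
  local token="$1" url="$2" out="$3"

  # First request: headers only. The authorized blob request answers
  # with a 302 redirect to the real download location.
  curl -sI -H "Authorization: Bearer $token" "$url" -o "${out}.headers"
  local final_url
  final_url="$(awk 'tolower($1) == "location:" { print $2 }' "${out}.headers" | tr -d '\r')"

  # Compare the remote size against whatever we already have locally.
  local remote_size local_size
  remote_size="$(curl -sI "$final_url" | awk 'tolower($1) == "content-length:" { print $2 }' | tr -d '\r')"
  remote_size="${remote_size:-0}"
  local_size="$(stat -c %s "$out" 2>/dev/null || echo 0)"  # GNU stat; use stat -f %z on macOS
  [ "$local_size" -ge "$remote_size" ] && return 0         # layer already complete

  # Resume from the current offset; curl fails on a 416 (range already
  # satisfied), which the caller can treat as "download finished".
  local limit=""
  [ -n "$BW" ] && limit="--limit-rate ${BW}k"
  curl -fC - $limit "$final_url" -o "$out"
}

# Registry tokens expire after roughly 30 minutes, so the caller loops:
# fetch a fresh token, call fetch_blob, and repeat until the blob is
# verifiably complete.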

In the future it would be nice if Docker's own client resumed downloads properly. Increasing the token's validity period would also help tremendously…

Hi,
We are seeing this problem at the moment. I thought it would have been fixed by now. Can anyone confirm that resume should now work when pulling from a private account on Docker Hub? Any suggestions on why it may not be working for us?
Thanks
Ed

Still experiencing the issue on my end, several years later.