Services stuck in "Pending" when using cloudstor:azure volumes

I’m trying to use the latest cloudstor:azure plugin, and getting the ping1 example working, as described here:

I created the volume, and it appears to be correct. I try and mount the volume in the service when I create it. It shows five replicas, but all of them are pending, and I do not understand why. Trying to get docker service logs on the service results in an error Unrecognized input header: 69, so I’m a little stumped.

Running the same command without the volume mount works just fine, so it’s something about the volume that makes it unhappy. Thoughts?

Additional Information

I created a three-node swarm manually (not using ACS), running Ubuntu 16.04LTS, and installed the latest docker images.
hoylu@swarm-01:~$ docker --version Docker version 17.03.0-ce, build 60ccb22 hoylu@swarm-01:~$ docker node ls ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS exztq8kxi8ruc4ocmv1xalp1f swarm-03 Ready Active g92hvwpvk84wweoks3pyr4zv0 swarm-02 Ready Active l7bopju9jz3iw31uquckhr5pa * swarm-01 Ready Active Leader hoylu@swarm-01:~$ docker plugin ls ID NAME DESCRIPTION ENABLED 6f685229a34a cloudstor:azure cloud storage plugin for Docker true hoylu@swarm-01:~$ docker volume ls DRIVER VOLUME NAME cloudstor:azure mysql-volume hoylu@swarm-01:~$ docker network ls NETWORK ID NAME DRIVER SCOPE d15099c8dd93 bridge bridge local ca3df71e2a05 docker_gwbridge bridge local 1e551bbaeaa3 host host local rzhlhvs7ufuk ingress overlay swarm e97bdeaadadd none null local hoylu@swarm-01:~$ docker volume ls DRIVER VOLUME NAME cloudstor:azure mysql-volume hoylu@swarm-01:~$ docker volume inspect mysql-volume [ { "Driver": "cloudstor:azure", "Labels": {}, "Mountpoint": "/mnt/cloudstor/mysql-volume", "Name": "mysql-volume", "Options": {}, "Scope": "local" } ] hoylu@swarm-01:~$ docker service ls ID NAME MODE REPLICAS IMAGE hoylu@swarm-01:~$ docker service create --replicas 5 --name ping1 --mount type=volume,volume-driver=docker4x/cloudstor:azure-v17.03.0-ce,source=mysql-volume,destination=/shareddata alpine ping docker.com tys1uop9p67rndalic9fhn701 hoylu@swarm-01:~$ docker service ls ID NAME MODE REPLICAS IMAGE tys1uop9p67r ping1 replicated 0/5 alpine:latest hoylu@swarm-01:~$ docker service ps tys1uop9p67r ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS 7vbv3gi1ga92 ping1.1 alpine:latest Running Pending 11 seconds ago lw51wrp2gnbz ping1.2 alpine:latest Running Pending 11 seconds ago xsw85sjdgku6 ping1.3 alpine:latest Running Pending 11 seconds ago h0sl2brmqcpm ping1.4 alpine:latest Running Pending 11 seconds ago q9f0xcxfr6hz ping1.5 alpine:latest Running Pending 11 seconds ago hoylu@swarm-01:~$ docker service logs tys1uop9p67r Unrecognized input header: 69

@mkporwit Using cloudstor in a custom setup (i.e. outside of Docker4AWS/Azure) is not really supported at the moment however you can configure it to work. Did you install cloudstor with the appropriate parameters? You do need to point it to a Azure Storage Account that will back the File Storage cloudstor will use.

docker plugin install --alias cloudstor:azure --grant-all-permissions docker4x/cloudstor:azure-v17.03.0-ce CLOUD_PLATFORM=AZURE AZURE_STORAGE_ACCOUNT_KEY="$SA_KEY" AZURE_STORAGE_ACCOUNT="$SA_NAME"

where SA_NAME can be the name of one of the storage accounts already provisioned in the resource group and SA_KEY is the Storage Account for the storage account that you can obtain from the Azure Portal/CLI.

I believe I followed all of those steps. Here’s the output of docker plugin inspect:
[ { "Config": { "Args": { "Description": "", "Name": "", "Settable": null, "Value": null }, "Description": "cloud storage plugin for Docker", "Documentation": "https://docs.docker.com/engine/extend/plugins/", "Entrypoint": [ "/cloudstor" ], "Env": [ { "Description": "", "Name": "CLOUD_PLATFORM", "Settable": [ "value" ], "Value": "" }, { "Description": "", "Name": "AZURE_STORAGE_ACCOUNT", "Settable": [ "value" ], "Value": "" }, { "Description": "", "Name": "AZURE_STORAGE_ACCOUNT_KEY", "Settable": [ "value" ], "Value": "" }, { "Description": "", "Name": "EFS_ID_REGULAR", "Settable": [ "value" ], "Value": "" }, { "Description": "", "Name": "EFS_ID_MAXIO", "Settable": [ "value" ], "Value": "" }, { "Description": "", "Name": "DEBUG", "Settable": [ "value" ], "Value": "" } ], "Interface": { "Socket": "cloudstor.sock", "Types": [ "docker.volumedriver/1.0" ] }, "Linux": { "AllowAllDevices": false, "Capabilities": [ "CAP_SYS_ADMIN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH" ], "Devices": [ { "Description": "", "Name": "", "Path": "/dev/fuse", "Settable": null } ] }, "Mounts": null, "Network": { "Type": "host" }, "PropagatedMount": "/mnt", "User": {}, "WorkDir": "", "rootfs": { "diff_ids": [ "sha256:3beba198b2072d10743da99a8bbb6ebfdfbbbca4d8432c30f462d0b6895e6069" ], "type": "layers" } }, "Enabled": true, "Id": "6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50", "Name": "cloudstor:azure", "PluginReference": "docker4x/cloudstor:azure-v17.03.0-ce", "Settings": { "Args": [], "Devices": [ { "Description": "", "Name": "", "Path": "/dev/fuse", "Settable": null } ], "Env": [ "CLOUD_PLATFORM=AZURE", "AZURE_STORAGE_ACCOUNT=XXXXXXXXXXXX", "AZURE_STORAGE_ACCOUNT_KEY=YYYYYYYYYYYYY", "EFS_ID_REGULAR=", "EFS_ID_MAXIO=", "DEBUG=" ], "Mounts": [] } } ]

@ddebroy In fact, I went back in my command history, and found the command used to install the plugin:
sudo docker plugin install --alias cloudstor:azure --grant-all-permissions docker4x/cloudstor:azure-v17.03.0-ce CLOUD_PLATFORM=AZURE AZURE_STORAGE_ACCOUNT_KEY="YYYYYYYYYYYY" AZURE_STORAGE_ACCOUNT="XXXXXXXXXXXX"

At this point I’d settle for figuring out why docker service logs croaks… maybe that would provide more insight into why the services are stuck in Pending.

@mkporwit Can you try starting the service with a constraint set on a particular node (as well as mounting a cloudstor volume) and grep for the plugin ID of cloudstor in /var/log/docker.log in that node?

grep $(docker plugin ls | grep cloudstor | cut -d' ' -f 1) /var/log/docker.log

You should see some errors popping up if cloudstor is having issues interacting with Azure for some reason (e.g. unreachable storage account/incorrect access key). Here’s how things look if everything is going well as in the case of sharedvol9 below.

Mar 22 18:09:11 moby root: time="2017-03-22T18:09:11Z" level=info msg="time=\"2017-03-22T18:09:11Z\" level=debug msg=\"request accepted\" name=sharedvol9 operation=get " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:09:11 moby root: time="2017-03-22T18:09:11Z" level=info msg="time=\"2017-03-22T18:09:11Z\" level=error msg=\"could not fetch metadata: cannot read metadata: open /etc/docker/plugins/cloudstor/volumes/sharedvol9: no such file or directory\" name=sharedvol9 operation=get " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:09:11 moby root: time="2017-03-22T18:09:11Z" level=info msg="time=\"2017-03-22T18:09:11Z\" level=debug msg=\"request accepted\" name=sharedvol9 operation=create options=map[] " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:09:11 moby root: time="2017-03-22T18:09:11Z" level=info msg="time=\"2017-03-22T18:09:11Z\" level=debug msg=\"request accepted\" name=sharedvol9 operation=path " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:09:14 moby root: time="2017-03-22T18:09:14Z" level=info msg="time=\"2017-03-22T18:09:14Z\" level=debug msg=\"request accepted\" name=sharedvol9 operation=get " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:09:14 moby root: time="2017-03-22T18:09:14Z" level=info msg="time=\"2017-03-22T18:09:14Z\" level=debug msg=\"request accepted\" name=sharedvol9 operation=mount " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d Mar 22 18:16:03 moby root: time="2017-03-22T18:16:03Z" level=info msg="time=\"2017-03-22T18:16:03Z\" level=debug msg=\"request accepted\" operation=list " plugin=3a7e9f999403cc75196256ec6376462a8c1ae2611da4701e1c6d7642a7a82d9d

@ddebroy Kicked off a new service using:
docker service create --replicas 1 --name pin1 --mount type=volume,volume-driver=docker4x/cloudstor:azure-v17.03.0-ce,source=mysql-volume,destination=/shareddata --constraint 'node.id=l7bopju9jz3iw31uquckhr5pa' --name helloworld alpine ping google.com

Still get the following:
hoylu@swarm-01:~$ docker service ps xkvmwj9ftnu1 ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS msvdzwjzeq2c helloworld.1 alpine:latest Running Pending 41 seconds ago

This being a newer Ubuntu instance, it doesn’t have a /var/log/docker.log, and then only log entries for docker.service are these:
hoylu@swarm-01:~$ journalctl -u docker.service -- Logs begin at Wed 2017-03-22 19:08:43 UTC, end at Wed 2017-03-22 22:08:12 UTC. -- Mar 22 22:02:37 swarm-01 dockerd[12140]: time="2017-03-22T22:02:37.030504499Z" level=error msg="Error creating service helloworld: rpc error: code = 2 desc = constraint expected one operator from ==, !=" Mar 22 22:02:37 swarm-01 dockerd[12140]: time="2017-03-22T22:02:37.030877812Z" level=error msg="Handler for POST /v1.26/services/create returned error: rpc error: code = 2 desc = constraint expected one operator from ==, !="

They’re not relevant, they’re caused by a typo when I first specified the constraint.

Or did I miss something in your suggestions?

@mkporwit Can you enable the Debug flag in the daemon and retry please? https://docs.docker.com/engine/admin/#configuring-docker

@ddebroy Here you go. Looks like just warnings to me, but you’re the expert here :slight_smile:
hoylu@swarm-01:~$ journalctl -u docker.service | grep $(docker plugin ls | grep cloudstor | cut -d' ' -f 1) Mar 23 00:02:15 swarm-01 dockerd[45686]: time="2017-03-23T00:02:15.229641461Z" level=warning msg="libcontainerd: client is out of sync, restore was called on a fully synced container (6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50)." Mar 23 00:02:15 swarm-01 dockerd[45686]: time="2017-03-23T00:02:15.230538812Z" level=warning msg="libcontainerd: failed to retrieve container 6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50 state: rpc error: code = 2 desc = containerd: container not found" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.289762876Z" level=warning msg="libcontainerd: client is out of sync, restore was called on a fully synced container (6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50)." Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.290245334Z" level=debug msg="libcontainerd: received past event &types.Event{Type:\"start-container\", Id:\"6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50\", Status:0x0, Pid:\"\", Timestamp:(*timestamp.Timestamp)(0xc4204763d0)}" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.290491511Z" level=debug msg="libcontainerd: received past event &types.Event{Type:\"exit\", Id:\"6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50\", Status:0x2, Pid:\"init\", Timestamp:(*timestamp.Timestamp)(0xc4204764b0)}" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.290698808Z" level=warning msg="libcontainerd: failed to retrieve container 6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50 state: rpc error: code = 2 desc = containerd: container not found" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.290906903Z" level=debug msg="plugin state changed 6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50 libcontainerd.StateInfo{CommonStateInfo:libcontainerd.CommonStateInfo{State:\"exit\", Pid:0x0, ExitCode:0x2, ProcessID:\"\"}, OOMKilled:false}" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.402244407Z" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-container\", Id:\"6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50\", Status:0x0, Pid:\"\", Timestamp:(*timestamp.Timestamp)(0xc420132660)}" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.403204327Z" level=debug msg="plugin state changed 6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50 libcontainerd.StateInfo{CommonStateInfo:libcontainerd.CommonStateInfo{State:\"start-container\", Pid:0xb55f, ExitCode:0x0, ProcessID:\"\"}, OOMKilled:false}" Mar 23 00:06:55 swarm-01 dockerd[46396]: time="2017-03-23T00:06:55.404053102Z" level=debug msg="libcontainerd: event unhandled: type:\"start-container\" id:\"6f685229a34ad3b9ca2ac7ff1a51ba75468ba7d72bf5583e42167b6130444d50\" timestamp:<seconds:1490227615 nanos:396670795 > "

@mkporwit can you try specifying cloudstor:azure as the plugin in your service creation cmdline (instead of docker4x/cloudstor:azure-v17.03.0-ce)?

So something along the lines of: docker service create --replicas 1 --name pin1 --mount type=volume,volume-driver=cloudstor:azure,source=mysql-volume,destination=/shareddata --constraint 'node.id=l7bopju9jz3iw31uquckhr5pa' --name helloworld alpine ping google.com

When the alias is specified during installation, it seems that’s what needs to be referenced. We will fix the docs to address this if that is indeed what you are running into.

Ding ding ding! Give that man a cigar!

@ddebroy – looks like specifying the alias instead of the full docker plugin name gets the service into a running state:
hoylu@swarm-01:~$ docker service create --replicas 1 --mount type=volume,volume-driver=cloudstor:azure,source=mysql-volume,destination=/shareddata --constraint 'node.id==l7bopju9jz3iw31uquckhr5pa' --name helloworld alpine ping google.com serup8pk80mlfhvk7ewpetjxl hoylu@swarm-01:~$ docker service ls ID NAME MODE REPLICAS IMAGE serup8pk80ml helloworld replicated 1/1 alpine:latest hoylu@swarm-01:~$ docker service ps helloworld ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS sly6m4g58kwf helloworld.1 alpine:latest swarm-01 Running Running 5 seconds ago

And docker service logs helloworld now works as well. So, maybe there was a bug where docker serivce logs returns an error when the service is stuck in Pending?

Glad it’s working now and sorry about the inconvenience. What was the specific error you were seeing when trying docker service logs ?

This is the error I was getting from docker service logs when the services were stuck in Pending.

Ah I see. That could be a swarmkit bug. Please file an issue under: https://github.com/docker/swarmkit/issues and a swarmkit expert will take a look.