Docker Swarm mode with Postgres fails only with persistent storage

Hi
I use the following .yml to deploy a Postgres service.

version: "3"

services:
  postgres:
    image: postgres:9.5
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - postgres
    deploy:
      placement:
        constraints: [node.role == worker]

networks:
  postgres:

volumes:
  db-data:
    driver: "vmdk"
    driver_opts:
      size: "500MB"

What I found is that the task is scheduled on the worker node as expected, but it only lasts for several seconds. Then the container shuts down and Swarm starts another container on the same node. Any ideas?

I also tried deploying with local volumes, and that works fine. I don’t know why it works with a local volume but not with the vmdk volume.

Output from the master node:

root@esx1-swarm01:~# docker service ls
ID            NAME               MODE        REPLICAS  IMAGE
sp4ec5qyixqh  postgres_postgres  replicated  0/1       postgres:9.5
root@esx1-swarm01:~# docker service ps postgres_postgres
ID            NAME                     IMAGE         NODE          DESIRED STATE  CURRENT STATE                    ERROR                      PORTS
frhvciz1hokz  postgres_postgres.1      postgres:9.5  esx1-swarm02  Running        Starting less than a second ago
ak2ak21i1zdi   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 5 seconds ago             "task: non-zero exit (1)"
q6240vlajwn3   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 14 seconds ago            "task: non-zero exit (1)"
j3y42ji0la5q   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 23 seconds ago            "task: non-zero exit (1)"
jxwjfw71kmnk   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 34 seconds ago            "task: non-zero exit (1)"

Output from the worker node on which the task is scheduled:
root@esx1-swarm01:~# docker service ls
ID            NAME               MODE        REPLICAS  IMAGE
sp4ec5qyixqh  postgres_postgres  replicated  0/1       postgres:9.5
root@esx1-swarm01:~# docker service ps postgres_postgres
ID            NAME                     IMAGE         NODE          DESIRED STATE  CURRENT STATE                    ERROR                      PORTS
frhvciz1hokz  postgres_postgres.1      postgres:9.5  esx1-swarm02  Running        Starting less than a second ago
ak2ak21i1zdi   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 5 seconds ago             "task: non-zero exit (1)"
q6240vlajwn3   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 14 seconds ago            "task: non-zero exit (1)"
j3y42ji0la5q   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 23 seconds ago            "task: non-zero exit (1)"
jxwjfw71kmnk   \_ postgres_postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 34 seconds ago            "task: non-zero exit (1)"

I see a lot of errors like "task: non-zero exit (1)", but how can I find out the detailed reason why the task failed on the worker node?

Thanks!
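
A rough sketch of how one might look for the detailed failure reason asked about above. It is not a confirmed fix for this setup. The two function names are hypothetical helpers introduced here for illustration; the docker subcommands and flags they call are standard. It assumes the failed containers are still present on the worker and carry the service name in their container name, which is the usual Swarm naming.

```shell
# Hypothetical helper: run on a manager node.
# --no-trunc prints the full ERROR column instead of a shortened message.
show_failed_task_errors() {
  docker service ps --no-trunc "$1"
}

# Hypothetical helper: run on the worker node where the task was scheduled.
# Finds exited containers whose name contains the given service name and
# prints the last lines of each container's log, where the Postgres
# entrypoint normally reports why it exited.
show_exited_container_logs() {
  for id in $(docker ps -a --filter "name=$1" --filter "status=exited" --format '{{.ID}}'); do
    echo "== logs for container $id =="
    docker logs "$id" 2>&1 | tail -n 20
  done
}
```

For the stack above this would be `show_failed_task_errors postgres_postgres` on esx1-swarm01 and `show_exited_container_logs postgres_postgres` on esx1-swarm02. The container log is usually more informative than the task error, since Swarm only records the exit code.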

I also tried creating the service from the command line and got the same error.

root@esx1-swarm01:~# docker service create --name=postgres --mount type=volume,src=postgres_vol1,dst=/var/lib/postgresql/data,volume-driver=vmdk --constraint node.role==worker postgres:9.5
9jst2lsdoysgqixx8fxyn0h0w
root@esx1-swarm01:~# docker service ls
ID            NAME      MODE        REPLICAS  IMAGE
9jst2lsdoysg  postgres  replicated  0/1       postgres:9.5
root@esx1-swarm01:~# docker service ps postgres
ID            NAME            IMAGE         NODE          DESIRED STATE  CURRENT STATE           ERROR                      PORTS
4mcsmlkgp697  postgres.1      postgres:9.5  esx1-swarm02  Ready          Preparing 1 second ago
q290l798jqw9   \_ postgres.1  postgres:9.5  esx1-swarm02  Shutdown       Failed 2 seconds ago    "task: non-zero exit (1)"

I’m having the same issue on a simple two-node swarm. The Postgres image starts fine as part of a service if I use a local volume. As soon as the volume is external (even just mapped to the host), I get the non-zero exit problem. If I don’t run in swarm mode and simply deploy using Docker Compose, there is no problem at all.