We need to deploy a Docker stack on a CentOS VM. We have a Docker Compose file that launches a stack with two services, each running one container. One of these containers connects to two external networks.
The docker-compose.yml looks like this:
version: '3'
services:
  GoOn_db:
    image: postgres
  GoOn_web:
    image: sshweb_5:new
    command: bash start.sh
    volumes:
      - .:/code
    ports:
      - "8000:8000"
      - "8022:22"
    networks:
      - external_oam_network
      - external_data_network
    depends_on:
      - GoOn_db
networks:
  external_oam_network:
    external:
      name: goon__oam
  external_data_network:
    external:
      name: goon__data
The external networks are swarm-scoped macvlan networks, created with commands like the following (shown here for goon__data):
docker network create --config-only --subnet 172.28.128.0/24 --gateway 172.28.128.1 -o parent=eth1 --ip-range 172.28.128.32/27 __goon__data
docker network create -d macvlan --scope swarm --config-from __goon__data goon__data
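Since both external networks follow the same two-step pattern (a config-only network, then a swarm-scoped macvlan network built from it), the pattern could be wrapped in a small helper. This is only a sketch: the function name `create_swarm_macvlan` and the `DOCKER` override variable are our own illustrative names, not part of Docker, and the `__<name>` prefix simply mirrors the naming used above.

```shell
# Hypothetical helper: creates a config-only network "__<name>" and then a
# swarm-scoped macvlan network "<name>" that takes its config from it.
# Set DOCKER=echo to print the docker arguments instead of running them.
create_swarm_macvlan() {
  local name="$1" subnet="$2" gateway="$3" parent="$4" ip_range="$5"
  local docker_cmd="${DOCKER:-docker}"

  # Step 1: config-only network holding subnet, gateway, parent and IP range.
  "$docker_cmd" network create --config-only \
    --subnet "$subnet" --gateway "$gateway" \
    -o "parent=$parent" --ip-range "$ip_range" "__${name}" &&
  # Step 2: swarm-scoped macvlan network that reuses that configuration.
  "$docker_cmd" network create -d macvlan --scope swarm \
    --config-from "__${name}" "$name"
}

# Data network, with the values from the commands above:
#   create_swarm_macvlan goon__data 172.28.128.0/24 172.28.128.1 eth1 172.28.128.32/27
# OAM network (values not shown in this post, so placeholders only):
#   create_swarm_macvlan goon__oam <subnet> <gateway> <parent-if> <ip-range>
```

Running it with `DOCKER=echo` first is a cheap way to review the exact arguments before touching the daemon.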
The Docker stack is created using the command below:
docker stack deploy --compose-file docker-compose.yml app
The Issue:
With the above configuration, the stack comes up perfectly the first time. However, if the host VM reboots (or crashes and comes back up), the container that is connected to the external networks (the GoOn_web service) fails to start. The following errors appear in the journal logs:
Jun 13 15:00:14 localhost.localdomain dockerd[21817]: time="2018-06-13T15:00:14.393543091+05:30" level=error msg="fatal task error" error="network dm-g3ovik5qx6br is already using parent interface goon__data" module=node/agent/taskmanager node.id=enqfccpf6sn28l01f6i6grq6h service.id=6w743aksizz5b6p7u3xgqpet8 task.id=wlxpngi6faw571nkgm4f9p0c9
Jun 13 15:00:14 localhost.localdomain dockerd[21817]: time="2018-06-13T15:00:14.824590521+05:30" level=warning msg="failed to deactivate service binding for container app_GoOn_web.1.y7l2c88wrhfq54f5af4d0qio7" error="No such container: app_GoOn_web.1.y7l2c88wrhfq54f5af4d0qio7" module=node/agent node.id=enqfccpf6sn28l01f6i6grq6h
Jun 13 15:00:16 localhost.localdomain dockerd[21817]: time="2018-06-13T15:00:16.827271962+05:30" level=error msg="network goon__data remove failed: network goon__data not found" module=node/agent node.id=enqfccpf6sn28l01f6i6grq6h
Jun 13 15:00:16 localhost.localdomain dockerd[21817]: time="2018-06-13T15:00:16.827406882+05:30" level=error msg="remove task failed" error="network goon__data not found" module=node/agent node.id=enqfccpf6sn28l01f6i6grq6h task.id=y7l2c88wrhfq54f5af4d0qio7
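For anyone trying to reproduce this, the relevant lines can be pulled out of the journal with a small filter like the one below. It is a rough sketch: `filter_network_errors` is just an illustrative name, and the example invocation assumes the daemon runs as the systemd unit `docker.service`, which may differ on other setups.

```shell
# Hypothetical filter: reads journal text on stdin and keeps only the
# error-level dockerd lines that mention the given network name.
filter_network_errors() {
  local network="$1"
  grep -F 'level=error' | grep -F -- "$network"
}

# Example use on the host (unit name is an assumption):
#   journalctl -u docker.service --no-pager | filter_network_errors goon__data
```

The warning-level line about the service binding is dropped by this filter, since it only matches `level=error`.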
A second issue is that we found no way to completely remove the network together with its config-only network. We tried the following commands:
[localhost config_drive]# docker stack rm app
Removing service app_GoOn_db
Removing service app_GoOn_web
Removing network app_default
[localhost config_drive]# docker network rm goon__data
goon__data
[localhost config_drive]# docker network rm __goon__data
Error response from daemon: configuration network "__goon__data" is in use
It seems like the network cleanup has some issue as well.
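One check that might help narrow this down is to list which networks the daemon still considers to be built from the config-only network. The sketch below assumes that `docker network inspect` exposes a `ConfigFrom.Network` field in this Docker version; we have not confirmed that on 17.12. `users_of_config_network` is an illustrative name of our own.

```shell
# Hypothetical filter: each stdin line is "<network name> <config-from network>"
# (the second field is empty for networks not created with --config-from).
# Prints the names of networks that use the given config-only network.
users_of_config_network() {
  awk -v cfg="$1" '$2 == cfg { print $1 }'
}

# Example use on the host (field name is an assumption, see above):
#   docker network inspect --format '{{.Name}} {{.ConfigFrom.Network}}' \
#     $(docker network ls -q) | users_of_config_network __goon__data
```

If this still prints goon__data after `docker network rm goon__data` reported success, that would suggest a stale reference on the node rather than a mistake in our removal order.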
Please let us know if there are any workarounds for this issue or if our configuration needs some tweaking.
Possibly related issues found on GitHub:
https://github.com/docker/libnetwork/issues/1743
https://github.com/moby/moby/issues/23302 (Cannot remove network due to active endpoint, but cannot stop/remove containers)
The following is the output of the Docker commands after the host VM reboot:
[localhost config_drive]# docker version
Client:
 Version:       17.12.0-ce
 API version:   1.35
 Go version:    go1.9.2
 Git commit:    c97c6d6
 Built:         Wed Dec 27 20:10:14 2017
 OS/Arch:       linux/amd64
Server:
 Engine:
  Version:      17.12.0-ce
  API version:  1.35 (minimum version 1.12)
  Go version:   go1.9.2
  Git commit:   c97c6d6
  Built:        Wed Dec 27 20:12:46 2017
  OS/Arch:      linux/amd64
  Experimental: false
[root@localhost config_drive]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
1456a3ec482a        __goon__data        null                local
c637316e8a95        __goon__oam         null                local
395a87391443        bridge              bridge              local
67f95713ee03        docker_gwbridge     bridge              local
ut4qii3qdzrs        goon__data          macvlan             swarm
88zfsq41n7xo        goon__oam           macvlan             swarm
803609448d35        host                host                local
wrypoj5x9fxx        ingress             overlay             swarm
095dc2ca9729        none                null                local
[localhost config_drive]# docker stack ls
NAME SERVICES
app 2
[localhost config_drive]# docker service ls
ID             NAME           MODE         REPLICAS   IMAGE             PORTS
c4kpzmc26qgk   app_GoOn_db    replicated   1/1        postgres:latest
6w743aksizz5   app_GoOn_web   replicated   0/1        sshweb_5:new      *:8000->8000/tcp,*:8022->22/tcp
[localhost config_drive]# docker ps --all
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS      NAMES
ce820c3596f0   postgres:latest   "docker-entrypoint.s…"   9 minutes ago   Up 9 minutes   5432/tcp   app_GoOn_db.1.o000vpvfer0moz9b1xnh05wp5