Description
I have set up 5 Docker services on two nodes, based on microsoft/windowsservercore:1709, and in principle they work fine.
docker service ls
ID            NAME                       MODE        REPLICAS  IMAGE                                                                   PORTS
1kkrdzrsjkh3  admin-vz-next-8501         replicated  1/1       qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2  frontend-vz-next-9000      replicated  1/1       qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj  frontend-vz-next-9001      replicated  1/1       qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky  loadbalancer-vz-next-8001  replicated  1/1       qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq  backend-vz-next            replicated  2/2       qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest
However, after one of the nodes is restarted, the services hosted on the restarted node no longer start their replicas:
docker service ls
ID            NAME                       MODE        REPLICAS  IMAGE                                                                   PORTS
1kkrdzrsjkh3  admin-vz-next-8501         replicated  0/1       qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2  frontend-vz-next-9000      replicated  1/1       qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj  frontend-vz-next-9001      replicated  1/1       qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky  loadbalancer-vz-next-8001  replicated  0/1       qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq  backend-vz-next            replicated  0/2       qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest
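To spot this state automatically after a reboot, the listing can be parsed for services whose running replica count is below the desired count. The following is only a minimal sketch, assuming Python 3 is available on the manager node; the function name `under_replicated` and the `SAMPLE` constant are hypothetical, and the sample text is the post-restart listing shown above.

```python
from typing import List, Tuple


def under_replicated(listing: str) -> List[Tuple[str, int, int]]:
    """Return (name, running, desired) for services with running < desired.

    `listing` is the plain-text output of `docker service ls`.
    """
    result = []
    for line in listing.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a service row.
        if len(fields) < 4 or fields[0] == "ID":
            continue
        name, replicas = fields[1], fields[3]
        if "/" not in replicas:
            continue
        running, desired = (int(n) for n in replicas.split("/", 1))
        if running < desired:
            result.append((name, running, desired))
    return result


# Post-restart listing copied from this report.
SAMPLE = """\
ID NAME MODE REPLICAS IMAGE PORTS
1kkrdzrsjkh3 admin-vz-next-8501 replicated 0/1 qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2 frontend-vz-next-9000 replicated 1/1 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj frontend-vz-next-9001 replicated 1/1 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky loadbalancer-vz-next-8001 replicated 0/1 qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq backend-vz-next replicated 0/2 qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest
"""

if __name__ == "__main__":
    for name, running, desired in under_replicated(SAMPLE):
        print(f"{name}: {running}/{desired}")
```

On a live node the text would come from the CLI instead of `SAMPLE`, for example via `subprocess.run(["docker", "service", "ls"], capture_output=True, text=True).stdout`.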
In the event log, I see lots of errors like this:
failed to deactivate service binding for container backend-vz-next.2.lqdliv4nfc3cw5jbegln4bajz
module=node/agent node.id=vmt7ji8ro8ky028hq52s3cmuq error=No such container: backend-vz-next.2.lqdliv4nfc3cw5jbegln4bajz
fatal task error
node.id=vmt7ji8ro8ky028hq52s3cmuq task.id=l523y156qrdjx6rafi31m9bxs error=HNS failed with error : A network with this name already exists. service.id=s8zsbh9snrcqz6a8d9b6eu4r4 module=node/agent/taskmanager
What helps is completely removing and recreating all the services, including the Docker overlay network, twice. After the first attempt, we get the following errors in the event log:
task.id=qgybil1cg6hp61dg8cydfd3ly service.id=2qq3zyhd0gjq1ycm3jakpnebd error=HNS failed with error : The parameter is incorrect. module=node/agent/taskmanager node.id=vmt7ji8ro8ky028hq52s3cmuq
The second attempt usually works. If not, we repeat it a third or fourth time until it finally works. But we cannot keep doing this after every restart.
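For reference, the remove-and-recreate cycle described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `rebuild_commands` and `run_all` and the network name `vz-next-net` are hypothetical, and the `docker service create` options shown (`--name`, `--network`, `--replicas`) are a simplified stand-in, since the actual create commands for these services are not part of this report.

```python
import subprocess
from typing import Dict, List, Tuple


def rebuild_commands(network: str,
                     services: Dict[str, Tuple[str, int]]) -> List[List[str]]:
    """Build the docker CLI calls for one remove-and-recreate cycle.

    `services` maps a service name to (image, replica count).
    """
    cmds = []
    # 1. Remove every service attached to the overlay network.
    for name in services:
        cmds.append(["docker", "service", "rm", name])
    # 2. Remove and recreate the overlay network itself.
    cmds.append(["docker", "network", "rm", network])
    cmds.append(["docker", "network", "create", "--driver", "overlay", network])
    # 3. Recreate the services (options are assumed, see the note above).
    for name, (image, replicas) in services.items():
        cmds.append(["docker", "service", "create",
                     "--name", name,
                     "--network", network,
                     "--replicas", str(replicas),
                     image])
    return cmds


def run_all(cmds: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # "vz-next-net" is a placeholder; the real network name is not in this report.
    example = {
        "backend-vz-next": (
            "qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest", 2),
    }
    run_all(rebuild_commands("vz-next-net", example), dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it safe to review the sequence before running it against the swarm.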
Any advice would be helpful, since we intend to go live with this environment soon.
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Client:
Version: 17.06.2-ee-15
API version: 1.30
Go version: go1.8.7
Git commit: 64ddfa6
Built: Mon Jul 9 23:33:36 2018
OS/Arch: windows/amd64
Server:
Engine:
Version: 17.06.2-ee-15
API version: 1.30 (minimum version 1.24)
Go version: go1.8.7
Git commit: 64ddfa6
Built: Mon Jul 9 23:45:29 2018
OS/Arch: windows/amd64
Experimental: false
Output of docker info:
Containers: 6
Running: 0
Paused: 0
Stopped: 6
Images: 7
Server Version: 17.06.2-ee-15
Storage Driver: windowsfilter
Windows:
Logging Driver: json-file
Plugins:
Volume: local
Network: l2bridge l2tunnel nat null overlay transparent
Log: awslogs etwlogs fluentd json-file logentries splunk syslog
Swarm: active
NodeID: vmt7ji8ro8ky028hq52s3cmuq
Is Manager: true
ClusterID: 5h1zhy7h2zvkmwderln9bw0w0
Managers: 1
Nodes: 2
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Root Rotation In Progress: false
Node Address: 88.198.8.115
Manager Addresses:
88.198.8.115:2377
Default Isolation: process
Kernel Version: 10.0 16299 (16299.431.amd64fre.rs3_release_svc_escrow.180502-1908)
Operating System: Windows Server Datacenter
OSType: windows
Architecture: x86_64
CPUs: 8
Total Memory: 63.79GiB
Name: HETZNER-S002
ID: KW6X:DVU6:32HG:ZJ3C:C77C:RT4J:4MZO:6W2K:6LGH:HRO7:N7CA:7GQW
Docker Root Dir: C:\ProgramData\docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.):
Windows Server Core, Version 1709, with updates KB4339420 and KB4343897