Docker Community Forums

HNS failed with error "A network with this name already exists" after restarting the computer on Windows Server, version 1709


(Qontisurs) #1

Description

I have set up 5 Docker services on two nodes, based on microsoft/windowsservercore:1709, and they generally work fine.

docker service ls
ID                  NAME                        MODE                REPLICAS            IMAGE                                                                    PORTS
1kkrdzrsjkh3        admin-vz-next-8501          replicated          1/1                 qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2        frontend-vz-next-9000       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj        frontend-vz-next-9001       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky        loadbalancer-vz-next-8001   replicated          1/1                 qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq        backend-vz-next             replicated          2/2                 qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest

However, after restarting one computer, the services hosted on it no longer start their replicas:

docker service ls
ID                  NAME                        MODE                REPLICAS            IMAGE                                                                    PORTS
1kkrdzrsjkh3        admin-vz-next-8501          replicated          0/1                 qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2        frontend-vz-next-9000       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj        frontend-vz-next-9001       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky        loadbalancer-vz-next-8001   replicated          0/1                 qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq        backend-vz-next             replicated          0/2                 qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest
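
To spot the affected services without reading the table by eye, the listing can be parsed. This is only a minimal sketch: it assumes the default column layout of `docker service ls` shown above (name in the second column, REPLICAS in the fourth), and the function name `under_replicated` is made up for illustration.

```python
import re


def under_replicated(service_ls_output):
    """Return {service name: (running, desired)} for services below their target.

    Assumes the default `docker service ls` layout: ID, NAME, MODE, REPLICAS, ...
    """
    result = {}
    for line in service_ls_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # The header row has "REPLICAS" here, so it fails the match and is skipped.
        match = re.fullmatch(r"(\d+)/(\d+)", fields[3])
        if match is None:
            continue
        running, desired = int(match.group(1)), int(match.group(2))
        if running < desired:
            result[fields[1]] = (running, desired)
    return result


# The listing taken after the restart, as pasted in this post.
SAMPLE = """\
ID                  NAME                        MODE                REPLICAS            IMAGE                                                                    PORTS
1kkrdzrsjkh3        admin-vz-next-8501          replicated          0/1                 qontiscommoncontainers.azurecr.io/qontis/vega-admin:1709-latest
8yv0mi3zcsk2        frontend-vz-next-9000       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
fgo57964aslj        frontend-vz-next-9001       replicated          1/1                 qontisvzcontainers.azurecr.io/qontis/vega-frontend-vz:1709-latest
rr5akqkgmaky        loadbalancer-vz-next-8001   replicated          0/1                 qontiscommoncontainers.azurecr.io/qontis/vega-loadbalancer:1709-latest
s8zsbh9snrcq        backend-vz-next             replicated          0/2                 qontiscommoncontainers.azurecr.io/qontis/vega-backend:1709-latest
"""

if __name__ == "__main__":
    for name, (running, desired) in under_replicated(SAMPLE).items():
        print(f"{name}: {running}/{desired}")
```

On a live node, the same function could be fed the captured output of `docker service ls` instead of the pasted sample.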

In the event log, I see lots of errors like this:

failed to deactivate service binding for container backend-vz-next.2.lqdliv4nfc3cw5jbegln4bajz
module=node/agent node.id=vmt7ji8ro8ky028hq52s3cmuq error=No such container: backend-vz-next.2.lqdliv4nfc3cw5jbegln4bajz

fatal task error
node.id=vmt7ji8ro8ky028hq52s3cmuq task.id=l523y156qrdjx6rafi31m9bxs error=HNS failed with error : A network with this name already exists.  service.id=s8zsbh9snrcqz6a8d9b6eu4r4 module=node/agent/taskmanager
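
For sifting through many such entries, the `key=value` pairs can be pulled apart so the `error` text is easy to group by. The sketch below assumes the format is exactly what the excerpts above show: keys made of letters, dots, or underscores, and unquoted values that may contain spaces. The helper name `parse_fields` is hypothetical.

```python
import re

# Assumption: a key is a run of letters, dots, or underscores, preceded by
# whitespace (or the start of the line) and directly followed by "=".
_KEY = re.compile(r"(?:^|\s)([A-Za-z_.]+)=")


def parse_fields(line):
    """Split a 'key=value key=value' log line into a dict.

    Each value runs up to the next key, so values may contain spaces.
    """
    matches = list(_KEY.finditer(line))
    fields = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        fields[m.group(1)] = line[m.end():end].strip()
    return fields


# The "fatal task error" entry from this post.
LINE = (
    "node.id=vmt7ji8ro8ky028hq52s3cmuq task.id=l523y156qrdjx6rafi31m9bxs "
    "error=HNS failed with error : A network with this name already exists.  "
    "service.id=s8zsbh9snrcqz6a8d9b6eu4r4 module=node/agent/taskmanager"
)

if __name__ == "__main__":
    fields = parse_fields(LINE)
    print(fields["service.id"], "->", fields["error"])
```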

The only thing that helps is completely removing and recreating all the services, including the Docker overlay network, usually twice. After the first attempt, we get the following errors in the event log:

task.id=qgybil1cg6hp61dg8cydfd3ly service.id=2qq3zyhd0gjq1ycm3jakpnebd error=HNS failed with error : The parameter is incorrect.  module=node/agent/taskmanager node.id=vmt7ji8ro8ky028hq52s3cmuq

The second attempt usually works. If not, we repeat it a third or fourth time until it finally works. But we cannot keep doing this after every restart.
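
Until the underlying cause is found, the repeat-until-it-works routine could at least be scripted. The following is a rough sketch of that loop only, not a fix for the HNS error. The names `redeploy_until_healthy`, `redeploy`, and `is_healthy` are hypothetical, and the attempt limit and wait time are guesses, not measured values.

```python
import time


def redeploy_until_healthy(redeploy, is_healthy, max_attempts=4,
                           delay_seconds=30.0, sleep=time.sleep):
    """Call redeploy() until is_healthy() returns True; return the attempt count.

    redeploy:   hypothetical callable that removes the services and the overlay
                network and then recreates them.
    is_healthy: hypothetical callable that returns True once every service
                reports its desired number of replicas.
    Raises RuntimeError if the services are still unhealthy after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        redeploy()
        sleep(delay_seconds)  # give the swarm time to schedule the tasks
        if is_healthy():
            return attempt
    raise RuntimeError(f"services still unhealthy after {max_attempts} attempts")


if __name__ == "__main__":
    # Stand-ins that mimic the pattern from this post: the first attempt
    # fails, the second one succeeds.
    outcomes = iter([False, True])
    attempts = redeploy_until_healthy(
        redeploy=lambda: print("redeploying..."),
        is_healthy=lambda: next(outcomes),
        delay_seconds=0,
    )
    print("healthy after", attempts, "attempts")
```

In a real setup, `redeploy` would wrap the same `docker service rm`, `docker network rm`, `docker network create --driver overlay`, and `docker service create` commands we currently run by hand, and `is_healthy` would check the REPLICAS column of `docker service ls`.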

Any advice would be helpful, since we intend to go live with this environment soon.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:      17.06.2-ee-15
 API version:  1.30
 Go version:   go1.8.7
 Git commit:   64ddfa6
 Built:        Mon Jul  9 23:33:36 2018
 OS/Arch:      windows/amd64

Server:
 Engine:
  Version:      17.06.2-ee-15
  API version:  1.30 (minimum version 1.24)
  Go version:   go1.8.7
  Git commit:   64ddfa6
  Built:        Mon Jul  9 23:45:29 2018
  OS/Arch:      windows/amd64
  Experimental: false

Output of docker info:

Containers: 6
 Running: 0
 Paused: 0
 Stopped: 6
Images: 7
Server Version: 17.06.2-ee-15
Storage Driver: windowsfilter
 Windows:
Logging Driver: json-file
Plugins:
 Volume: local
 Network: l2bridge l2tunnel nat null overlay transparent
 Log: awslogs etwlogs fluentd json-file logentries splunk syslog
Swarm: active
 NodeID: vmt7ji8ro8ky028hq52s3cmuq
 Is Manager: true
 ClusterID: 5h1zhy7h2zvkmwderln9bw0w0
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Root Rotation In Progress: false
 Node Address: 88.198.8.115
 Manager Addresses:
  88.198.8.115:2377
Default Isolation: process
Kernel Version: 10.0 16299 (16299.431.amd64fre.rs3_release_svc_escrow.180502-1908)
Operating System: Windows Server Datacenter
OSType: windows
Architecture: x86_64
CPUs: 8
Total Memory: 63.79GiB
Name: HETZNER-S002
ID: KW6X:DVU6:32HG:ZJ3C:C77C:RT4J:4MZO:6W2K:6LGH:HRO7:N7CA:7GQW
Docker Root Dir: C:\ProgramData\docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

Windows Server Core, Version 1709, with updates KB4339420 and KB4343897