Swarm Timeouts Intermittent

Hey guys,

Running into a pretty big issue on our setup.

We have a 4 node swarm running with NGINX reverse proxy and a simple web application.

For a few months now this has been running without a hitch but recently we began experiencing timouts, 503, 504 and 302 errors on a regular basis. There are no apparent issues shown in the logs, firewall rules are good, the stack has been redeployed multiple times and issues persist.

I have a feeling it is related to the overlay network?

Not sure what else to look into.

Also note that we have the same setup on a dev swarm and have not had this issue.

Docker-stack-compose.yml:

version: '3'

services:
  imageapi:
    image: git.*****/docker/retrieveimageapi
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        delay: 10s
        monitor: 15s
  proxy:
    image: git.*****/docker/apiproxy
    ports:
      - "80:80"
    deploy:
      mode: global
      update_config:
        parallelism: 1
        delay: 10s
        monitor: 15s

Docker version:

Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:23:31 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:19:04 2017
 OS/Arch:      linux/amd64
 Experimental: false