NGINX swarm redeploy timeouts

I'm getting strange behavior when restarting or updating my docker-compose stack. The first time I init a swarm and deploy a stack, I can hit the NGINX proxy and am greeted with my application's messages. Now if I tear down the stack:

docker stack rm hello

I see there are no containers left. But after running the same deploy command again:

docker stack deploy -c helloworld/docker-compose.yml hello

I can reach my application on port 8080 fine, but the NGINX server just times out, despite the service reporting its replicas as deployed.

This is my docker compose file:

services:
  web:
    image: helloworld_web:latest
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - "8080:8080"
    networks:
      - webnet
    command: gunicorn -b :8080 app:application

  nginx:
    image: helloworld_nginx:latest
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    ports:
      - "80:80"
    networks:
      - webnet
    depends_on:
      - web

networks:
  webnet:

It seems the only way to get it working again is to leave the swarm and restart the docker service.
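
Roughly the sequence I end up running to recover (the daemon restart assumes systemd, since this is Ubuntu):

# tear down and leave the swarm
docker stack rm hello
docker swarm leave --force

# restart the Docker daemon
sudo systemctl restart docker

# re-init the swarm and deploy the stack again
docker swarm init
docker stack deploy -c helloworld/docker-compose.yml hello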

Is there a simple command I'm missing that will fix this issue? It's worth noting that the same compose file works fine with docker-compose up across restarts every time (with the warnings that the deploy key is ignored when not in a swarm).

Hi, which Docker “product” and version are you running? (Docker CE? Docker for Windows? Docker for Mac? Docker for Linux?)

You might also want to share your nginx configuration.

I would bet my money on “RFC 5861 aka. dns-caching” trouble due to static proxy_pass declarations.


Quick responses :smiley:

It's Docker CE for Linux (Ubuntu).

nginx.conf:

# Define the user that will own and run the Nginx server
user  nginx;
# Define the number of worker processes; recommended value is the number of
# cores that are being used by your server
worker_processes  1;
 
# Define the location on the file system of the error log, plus the minimum
# severity to log messages for
error_log  /var/log/nginx/error.log warn;
# Define the file that will store the process ID of the main NGINX process
pid        /var/run/nginx.pid;
 
 
# events block defines the parameters that affect connection processing.
events {
   # Define the maximum number of simultaneous connections that can be opened by a worker process
   worker_connections  1024;
}
 
 
# http block defines the parameters for how NGINX should handle HTTP web traffic
http {
   # Include the file defining the list of file types that are supported by NGINX
   include       /etc/nginx/mime.types;
   # Define the default file type that is returned to the user
   default_type  application/json;
 
   # Define the format of log messages.
   log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"';
 
   # Define the location of the log of access attempts to NGINX
   access_log  /var/log/nginx/access.log  main;
 
   # Define the parameters to optimize the delivery of static content
   sendfile        on;
   tcp_nopush     on;
   tcp_nodelay    on;
 
   # Define the timeout value for keep-alive connections with the client
   keepalive_timeout  65;
 
   # Define the usage of the gzip compression algorithm to reduce the amount of data to transmit
   #gzip  on;
 
   # Include additional parameters for virtual host(s)/server(s)
   include /etc/nginx/conf.d/*.conf;
}

server.conf:

# Define the parameters for a specific virtual host/server
server {
   # Define the directory where the contents being requested are stored
   # root /usr/src/app/project/;
 
   # Define the default page that will be served If no page was requested
   # (ie. if www.kennedyfamilyrecipes.com is requested)
   # index index.html;
 
   # Define the server name, IP address, and/or port of the server
   listen 80;
   server_name api.helloworld.com;

   # Add the specified charset to the “Content-Type” response header field
   charset utf-8;
 
   # Configure NGINX to reverse proxy HTTP requests to the upstream server (Gunicorn (WSGI server))
   location / {
       # Define the location of the proxy server to send the request to
       proxy_pass http://web:8080/;
 
       # Redefine the header fields that NGINX sends to the upstream server
       proxy_set_header Host $host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 
       # Define the maximum file size on file uploads
       client_max_body_size 5M;
   }
}

I am not sure if I got the RFC right, but this pretty much seems like a DNS caching issue:

Try modifying your server.conf like this.

server {
    ...
    # set DNS resolver as Docker internal DNS
    resolver 127.0.0.11 valid=10s;
    resolver_timeout 5s;
    ...
    location / {
        # prevent dns caching and force nginx to make a dns lookup on each request
        set $target http://web:8080;
        proxy_pass $target;
        ...
    }
}

The first part tells nginx to use the swarm network's DNS server, with a short cache validity time.
The second part uses an indirect proxy_pass to prevent caching of the proxy_pass target altogether.
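
For context, merged into your original server.conf it would look roughly like this (untested sketch, same names and headers as in your config):

server {
    listen 80;
    server_name api.helloworld.com;

    charset utf-8;

    # Use Docker's embedded DNS server and only cache lookups briefly
    resolver 127.0.0.11 valid=10s;
    resolver_timeout 5s;

    location / {
        # Using a variable forces nginx to re-resolve "web" on each request
        # instead of resolving it once at startup
        set $target http://web:8080;
        proxy_pass $target;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        client_max_body_size 5M;
    }
}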


It was this… simply amazing response.

Edit: it now fails to start up again once removed from the swarm, although updating works.

Sometimes it takes a couple of seconds until it gets the updated container IPs…
The time gap during an update is short enough to be considered perfect :slight_smile:

Yeah, I get that sometimes as well, when the stack is removed and immediately deployed again. It helps to wait a second or two.
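
If you want to check whether nginx can already resolve the service name after a redeploy, something like this should work (assuming the nginx image is glibc-based so getent is available, and the default stack_service container naming):

# resolve the "web" service name from inside the running nginx task of the "hello" stack
docker exec $(docker ps -q -f name=hello_nginx) getent hosts web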

We started to externalize our reverse proxy containers and put them in their own stacks. They are part of the same externally created network as the set of backend stacks they serve data for. Usually we only update the backend stacks and rarely touch the reverse proxy stacks.
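
As a rough sketch of that setup (the network name here is just an example): the overlay network is created once, outside of any stack:

docker network create --driver overlay proxy-net

and the compose files of both the reverse proxy stack and the backend stacks then reference it as external:

networks:
  proxy-net:
    external: true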

All working now, thanks