Docker Swarm Series: #5 Troubleshooting



Inception

Hello everyone. This article is part of the Swarm series. The knowledge in this series builds in sequence, so check out the Swarm series section above.

In the last article we covered how to deploy a stack to a swarm cluster and the value behind stacks. We deployed a simple Nginx web app and a MySQL database using a Docker Compose YAML file, the docker stack deploy command, and the Play-with-Docker lab.




Overview

In this article we continue the Swarm tutorials from where we left off: the stack's containers ran into errors and could not start. We will go through troubleshooting step by step and pin down the exact issue. This lab also uses Play-with-Docker.




Set up the environment

To start troubleshooting, we first need to redeploy the last stack, or simply deploy the YAML file below on the Play-with-Docker lab:

version: '3'

services:  # services list
  nginx:  # service name
    image: nginx:latest  # specify an image with its tag
    ports:  # publish host port 8080 to container port 80
      - "8080:80"
    volumes:  # bind-mount the local nginx.conf into the nginx container
      - ./nginx.conf:/etc/nginx/nginx.conf

    # connection settings for the mysql service, matching the variables defined in its environment below
    environment:
      MYSQL_HOST: mysql
      MYSQL_PORT: 3306
      MYSQL_DATABASE: myapp
      MYSQL_USER: root
      MYSQL_PASSWORD: password

    # this service depends on mysql (note: docker stack deploy ignores depends_on)
    depends_on:
      - mysql

  mysql:
    image: mysql:latest
    volumes:
      - ./data:/var/lib/mysql

    # define mysql variables
    environment:
      MYSQL_DATABASE: myapp
      MYSQL_USER: root
      MYSQL_PASSWORD: password
      MYSQL_ROOT_PASSWORD: password

Deploy the stack:

docker stack deploy -c docker-stack-file.yaml myapp

Print the status of the deployed services:

docker stack services myapp






Troubleshooting

When you print the services of the myapp stack, you will find that the REPLICAS column shows 0/1. The desired number of replicas for the service is 1, but the actual number running is 0, which means the service has not been deployed successfully yet.

Swarm keeps retrying to deploy these services, but the result is always the same. Let's figure out where the issue is.
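
The desired-versus-actual comparison can also be done by a small script. The following is a minimal sketch, not part of the original lab: the function name under_replicated is hypothetical, and it assumes the "name running/desired" layout produced by the --format string shown in the usage comment.

```shell
# Print services whose running replica count differs from the desired count.
# Reads lines of the form "<name> <running>/<desired>" on stdin.
# (under_replicated is a hypothetical helper name used for illustration.)
under_replicated() {
  awk '{
    split($2, r, "/")   # "0/1" -> r[1] = running, r[2] = desired
    if (r[1] != r[2]) print $1 " running=" r[1] " desired=" r[2]
  }'
}

# Usage on a swarm manager:
#   docker stack services --format '{{.Name}} {{.Replicas}}' myapp | under_replicated
```

An empty result would suggest every service has reached its desired replica count.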




Down we go



  • Run the following to list the deployed stacks:
docker stack ls


We have one stack called myapp with two services.




  • Run the following to print the services of that stack:
docker stack services myapp


As mentioned above, the actual number of running replicas is 0. Let's find out why.




  • Run the following to print the tasks behind the stack's services, with more detail:
docker stack ps myapp








Okay, now let's explain what we're seeing here.

Focus on the row of a rejected task. The first column is the task ID, the second is the task name, and the fourth is the name of the node that hosts the task. Next to it is the Desired state column with the status Ready, meaning Swarm was trying to deploy on this host, while the next column, Current state, shows Rejected. To see why that happened, look at the last column, Error.
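
If the full table is hard to read, the columns discussed here can be selected on their own. This is a small sketch under assumptions: task_summary is a hypothetical helper name, and the column choice is just one option among the fields that docker stack ps can format.

```shell
# Print a compact table of the tasks in a stack.
# $1 = stack name. task_summary is a hypothetical helper for illustration.
task_summary() {
  docker stack ps \
    --format 'table {{.Name}}\t{{.Node}}\t{{.DesiredState}}\t{{.CurrentState}}\t{{.Error}}' \
    "$1"
}

# Usage on a swarm manager:
#   task_summary myapp
```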




However, the Error column is truncated. Let's expand it:

docker stack ps --no-trunc myapp









That works, but the output now covers the tasks of every service. Let's focus on the Nginx service by running the following:

docker service ps --no-trunc myapp_nginx

Now the error is clear enough: “bind source path does not exist: /root/nginx.conf”

We never created an nginx.conf file to mount inside the container. Let's fix that.
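
This kind of error can be caught before deploying by checking that every bind-mount source exists. The sketch below is an illustration only: check_bind_sources is a hypothetical helper, the paths in the usage comment are the ones from the YAML file above, and the check only covers the node where it runs (in a multi-node swarm the path has to exist on whichever node ends up running the task).

```shell
# Report which bind-mount source paths are missing on this node.
# Returns non-zero if at least one path is missing.
# (check_bind_sources is a hypothetical helper name.)
check_bind_sources() {
  missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, from the directory that holds the YAML file:
#   check_bind_sources ./nginx.conf ./data
```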

  • Create a very simple nginx.conf file in the same directory as the YAML file:
vim nginx.conf

# paste the following into it
worker_processes 1;

events {
  worker_connections 1024;
}

http {
  server {
    listen 80;
    server_name example.com;
    root /var/www/html;

    location / {
      try_files $uri $uri/ =404;
    }
  }
}
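
Before redeploying, the new file can be syntax-checked with nginx -t in a throwaway container. This is a hedged sketch rather than a step from the original lab: validate_nginx_conf is a hypothetical helper, and it assumes the node can run the nginx:latest image used in the stack.

```shell
# Syntax-check an nginx.conf by mounting it read-only into a temporary
# nginx container and running "nginx -t".
# $1 = path to the config file. validate_nginx_conf is a hypothetical helper.
validate_nginx_conf() {
  # Resolve an absolute path for the bind mount.
  conf_dir=$(cd "$(dirname "$1")" && pwd) || return 1
  conf_path="$conf_dir/$(basename "$1")"
  docker run --rm \
    -v "$conf_path:/etc/nginx/nginx.conf:ro" \
    nginx:latest nginx -t
}

# Usage:
#   validate_nginx_conf ./nginx.conf
```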



  • Update the stack by running the following (use the same YAML file name as in the first deploy):
docker stack deploy -c docker-stack-file.yaml myapp


  • The service is now running on manager node 2.

  • Optionally, SSH into manager 2 and run the following:
# connect to the remote node
ssh root@manager2

# list the running containers
docker container ls


  • Let's run a simple smoke test with curl:
curl localhost:8080
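
A task can take a few seconds to start, so a single curl may run too early. The following sketch retries until the server answers with any HTTP status. The helper name wait_for_http and its default attempt count and delay are assumptions made for illustration.

```shell
# Poll a URL until the server answers with any HTTP status, or give up.
# $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 2)
# (wait_for_http is a hypothetical helper name.)
wait_for_http() {
  url=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # curl prints 000 as the status code when no HTTP response was received.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || true
    if [ -n "$code" ] && [ "$code" != "000" ]; then
      echo "attempt $i: HTTP $code"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response after $attempts attempts"
  return 1
}

# Usage:
#   wait_for_http http://localhost:8080
```

The function reports whatever status code comes back instead of requiring a 200, since a smoke test here only needs to show that Nginx is answering.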







Spotlight


Here we troubleshot the myapp_nginx service. If you troubleshoot the myapp_mysql service the same way, you will find the same issue. However, it is more complex and there is no room to discuss it here.

The issue we faced here was a service that failed to start, and we spotted it by printing the task status with the docker service ps --no-trunc myapp_nginx command. But what if the service starts successfully and the app hosted inside the container (the Nginx web app in our case) has an issue that blocks it? Then you move to the next layer of troubleshooting, the service logs. Fetch them by running the following:

docker service logs <service-name>

# for a specific task (pass the task ID in place of the service name)
docker service logs <task-id>

# follow the logs live, starting from the last 100 lines
docker service logs --follow --tail 100 <service-name>

# or log in to the node that hosts the container and run:
docker container logs <container-id>
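
When the logs are long, it can help to keep only the lines that mention a problem. This is a minimal sketch: filter_problem_lines is a hypothetical helper, and the keyword list is an assumption that should be adjusted to the application's log format.

```shell
# Keep only log lines that mention a problem keyword (case-insensitive).
# The keyword list is an assumption, not an official set of log levels.
# (filter_problem_lines is a hypothetical helper name.)
filter_problem_lines() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -iE 'error|warn|emerg|crit|fail' || true
}

# Usage on a swarm manager:
#   docker service logs --since 10m --timestamps myapp_nginx 2>&1 | filter_problem_lines
```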

Alternatively, you can log in to the node that hosts the container, open a shell inside it with the docker exec command, and inspect the logs from there.
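
To use docker exec you first need the container ID on that node. The sketch below looks it up through the label that Swarm attaches to task containers. service_containers is a hypothetical helper name, and the example assumes the container image ships a sh shell.

```shell
# Print the IDs of containers on this node that belong to a swarm service.
# $1 = full service name, for example myapp_nginx
# (service_containers is a hypothetical helper name.)
service_containers() {
  docker ps \
    --filter "label=com.docker.swarm.service.name=$1" \
    --format '{{.ID}}'
}

# Usage, on the node that runs the task:
#   cid=$(service_containers myapp_nginx | head -n 1)
#   docker exec -it "$cid" sh
```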




That's it. Very straightforward, very fast 🚀.
I hope this article inspired you, and I would appreciate your feedback. Thank you.