Inception
Hello everyone. This article is part of The Swarm series. The knowledge in this series builds in sequence, so check out The Swarm series section above.
In the last article we covered how to deploy a stack to a Swarm cluster and the value behind stacks. We deployed a simple Nginx web app and a MySQL database using a Docker Compose YAML file, the docker stack deploy command, and the Play-with-Docker lab.
Overview
In this article we continue the Swarm tutorials from where we left off: the containers of the stack ran into errors and could not start. We will go through troubleshooting step by step and find out how to pin down the exact issue. This lab also uses Play-with-Docker.
Set up the environment
To start troubleshooting, first redeploy the last stack, or simply deploy the YAML file below on the Play-with-Docker lab:
version: '3'
services: # services list
  nginx: # service name
    image: nginx:latest # specify an image with its tag
    ports: # defining ports
      - "8080:80"
    volumes: # mount the nginx.conf stored on the local machine into the nginx container
      - ./nginx.conf:/etc/nginx/nginx.conf
    # connection settings for the mysql container, matching the variables defined in the mysql environment below
    environment:
      MYSQL_HOST: mysql
      MYSQL_PORT: 3306
      MYSQL_DATABASE: myapp
      MYSQL_USER: root
      MYSQL_PASSWORD: password
    # this container depends on mysql
    depends_on:
      - mysql
  mysql:
    image: mysql:latest
    volumes:
      - ./data:/var/lib/mysql
    # define mysql variables
    environment:
      MYSQL_DATABASE: myapp
      MYSQL_USER: root
      MYSQL_PASSWORD: password
      MYSQL_ROOT_PASSWORD: password
Deploy
docker stack deploy -c docker-stack-file.yaml myapp
Print the status of the deployed services
docker stack services myapp
Troubleshooting
When you print the services of the myapp stack, you will see that the REPLICAS column shows 0/1. The desired number of replicas for the service is 1 but the actual number running is 0, which means the service's container has not started.
Swarm keeps retrying to deploy these services, but the result stays the same. Let's figure out where the issue is.
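To spot unconverged services without reading the table by eye, the REPLICAS value can be split on the slash and the two numbers compared. The sketch below assumes a hypothetical helper called not_converged (it is not part of Docker) that reads "<name> <running>/<desired>" lines, which docker stack services can print through its --format option.

```shell
# Hypothetical helper: print the services whose running replica count
# is below the desired count.
# Reads lines of the form "<service-name> <running>/<desired>" on stdin;
# anything after the replica field is ignored.
not_converged() {
  while read -r name replicas rest; do
    running=${replicas%%/*}   # part before the slash
    desired=${replicas##*/}   # part after the slash
    if [ "$running" -lt "$desired" ]; then
      echo "$name is at $replicas"
    fi
  done
}

# Usage on a manager node (myapp is the stack deployed above):
#   docker stack services --format '{{.Name}} {{.Replicas}}' myapp | not_converged
```

If the helper prints nothing, every service in the stack has reached its desired replica count.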
Down we go
- Run the following to list the deployed stacks
docker stack ls
We have one stack called myapp with two services.
- Run the following to print the services of that stack
docker stack services myapp
As mentioned above, the actual replica count is 0. Let's find out why…
- Run the following to print the tasks of each service with more detail
docker stack ps myapp
Okay, now let's explain what we're seeing here.
Let's focus on the highlighted rows. The first column is the task ID, the second is the task name, and the fourth is the name of the node that hosts the task. Next to it is the DESIRED STATE column showing Ready, which means Swarm was trying to deploy the task on this node, while the CURRENT STATE column next to it shows Rejected. To see why that happened, look at the ERROR column.
However, the ERROR column is truncated, so let's expand it:
docker stack ps --no-trunc myapp
That works, but there are too many entries. Let's focus on the Nginx service by running the following:
docker service ps --no-trunc myapp_nginx
Now the error is clear enough: “bind source path does not exist: /root/nginx.conf”
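Because Swarm keeps rescheduling the failed task, the same error tends to repeat across many rows. As a rough sketch, the messages can be collapsed into one line per distinct error with a count. The helper name summarize_errors is an assumption for illustration. The --format option and the .Error field belong to docker service ps.

```shell
# Hypothetical helper: collapse repeated task errors into a count per
# distinct message, most frequent first.
# Reads one error message per line on stdin; blank lines (tasks that
# reported no error) are skipped.
summarize_errors() {
  grep -v '^[[:space:]]*$' | sort | uniq -c | sort -rn
}

# Usage on a manager node:
#   docker service ps --no-trunc --format '{{.Error}}' myapp_nginx | summarize_errors
```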
We never created the nginx.conf file that the stack mounts into the container. Let's fix this.
- Create a very simple nginx.conf file in the same directory as the YAML file:
vim nginx.conf
# paste the content below into it
worker_processes 1;
events {
    worker_connections 1024;
}
http {
    server {
        listen 80;
        server_name example.com;
        root /var/www/html;
        location / {
            try_files $uri $uri/ =404;
        }
    }
}
- Update the stack by running the following:
docker stack deploy -c docker-stack-file.yaml myapp
- Now the container of the service is running on manager node 2:
- Optional: SSH into manager 2 and run the following:
# connect to the remote node
ssh root@manager2
# print containers list
docker container ls
- Let's run a simple smoke test with curl:
curl localhost:8080
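For a smoke test, the HTTP status code is often more telling than the response body. The sketch below classifies the code that curl prints through its -w '%{http_code}' option. curl reports 000 when it received no HTTP response at all. The helper name classify_status is hypothetical. Note that with the minimal nginx.conf above, a 404 may well be the outcome if nothing exists under /var/www/html, and that still shows Nginx is listening.

```shell
# Hypothetical helper: describe what an HTTP status code from curl says
# about the Nginx service.
classify_status() {
  case "$1" in
    000)     echo "no HTTP response (connection failed)" ;;
    2??|3??) echo "nginx is up and served the request" ;;
    4??|5??) echo "nginx is up but returned an error status" ;;
    *)       echo "unexpected status: $1" ;;
  esac
}

# Usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' localhost:8080)
#   classify_status "$code"
```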
Spotlight
Here we troubleshot the myapp_nginx service. If you troubleshoot the myapp_mysql service, you will find the same kind of issue. However, it is more complex and there is no time to discuss it here.
The issue we faced here was a service that failed to start, and we spotted it by printing the task status with the docker service ps --no-trunc myapp_nginx command. What if the container starts successfully, but the app hosted inside it (the Nginx web app in our case) has an issue that blocks it? In that case, move to the next layer of troubleshooting, the service logs. Fetch them by running the following:
docker service logs <service-name>
# for a specific task, pass the task ID instead of the service name
docker service logs <task-id>
# follow the log stream, starting from the last 100 lines
docker service logs --follow --tail 100 <service-name>
# or log in to the node that hosts the container and run:
docker container logs <container-id>
Alternatively, you can log in to the hosting node, open a session inside the container with the docker exec command, and troubleshoot from there.
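As a minimal sketch of that last step, the snippet below picks the container that belongs to a service and then runs commands inside it. It assumes Swarm names task containers "<service>.<slot>.<task-id>", which is what docker container ls usually shows on a node. The helper container_for_service is hypothetical, while nginx -t is the standard Nginx configuration check.

```shell
# Hypothetical helper: print the ID of the first container whose name
# starts with "<service>.".
# Reads "<container-id> <container-name>" lines on stdin.
container_for_service() {
  awk -v prefix="$1." 'index($2, prefix) == 1 { print $1; exit }'
}

# Usage on the node that hosts the task:
#   cid=$(docker container ls --format '{{.ID}} {{.Names}}' | container_for_service myapp_nginx)
#   docker exec "$cid" nginx -t      # validate the Nginx configuration inside the container
#   docker exec -it "$cid" sh        # or open an interactive shell
```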
That's it. Very straightforward, very fast🚀.
I hope this article inspired you, and I would appreciate your feedback. Thank you.