Hi Docker experts,
I am having the problem that a large docker-compose.yml with lots of containers (~400) overwhelms single docker nodes when launched via docker stack deploy.
I was deploying the same docker-compose.yml file only on one node via docker compose up and used the depends_on attribute to have every subsequent container wait until the previous one was spawned. That worked perfectly since the node was not overwhelmed with spawning new containers.
docker stack deploy, however, seems to ignore this statement and tries to spawn all containers simultaneously, which is simply too much. Each container does not do much once it is spawned, which is why I only observed a peak load during the spawning process.
How do I get docker stack deploy to deploy containers linearly, or at least in batches, so that my nodes don't crash?
I believe 400 containers are too many to define in a single compose file or swarm stack file. Do they all really depend on each other, or did you just use depends_on to avoid deploying them simultaneously?
I would use multiple yaml files (compose or stack) to deploy fewer containers simultaneously and start to deploy the next. You could create a script with some delay or check the previous stack’s state before deploying the next one.
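A rough sketch of such a script in Python, in case it helps. It deploys one stack file at a time and polls docker stack services until every service reports its desired replica count before moving on. The stack names, file names, poll interval and timeout are placeholders I made up, and the parsing assumes the Replicas column looks like "1/1" (possibly followed by extra text), so treat it as a starting point rather than a tested tool.

```python
import subprocess
import time


def replicas_ready(replicas_output: str) -> bool:
    """Return True if every 'running/desired' line shows running >= desired."""
    lines = [ln.strip() for ln in replicas_output.splitlines() if ln.strip()]
    if not lines:
        return False  # no services listed yet
    for line in lines:
        # Assumption: each line starts with "running/desired", e.g. "1/1".
        running, desired = line.split()[0].split("/")
        if int(running) < int(desired):
            return False
    return True


def run_docker(args):
    """Run 'docker <args>' and return its stdout as text."""
    result = subprocess.run(
        ["docker", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def deploy_stacks(stack_files, run=run_docker, poll_seconds=5.0,
                  timeout_seconds=600.0, sleep=time.sleep):
    """Deploy (stack_name, compose_path) pairs one after another."""
    for name, path in stack_files:
        run(["stack", "deploy", "-c", path, name])
        waited = 0.0
        # Wait until the stack has converged before starting the next one.
        while not replicas_ready(
            run(["stack", "services", "--format", "{{.Replicas}}", name])
        ):
            if waited >= timeout_seconds:
                raise TimeoutError(f"stack {name} did not converge in time")
            sleep(poll_seconds)
            waited += poll_seconds
```

You would call it with something like deploy_stacks([("part1", "part1.yml"), ("part2", "part2.yml")]), where each yaml file holds a manageable slice of the 400 services.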
Regarding depends_on, I think Docker Swarm should not ignore it. I don’t use Swarm, but the documentation contains examples using depends_on with Swarm, and there are recent issues indicating that people are using it for Swarm deployments, so maybe it just works differently.
Note that even Kubernetes doesn’t have a feature like that natively without operators and people are complaining about that for the same reason as you. If I remember well, Kubernetes has some new ideas to solve that in the future like being able to change the resource requirements after starting the containers, but it is still an issue.
So again, even if depends_on should work, I would split up that single yaml file into multiple smaller yamls. It would be easier to read and understand the files too.
It was always misleading that the same compose specification was used to describe compose and swarm functionality.
depends_on was never implemented for swarm stack deployments.
That's what I thought. depends_on is ignored in a swarm setup.
My solution for now is to spawn containers manually (scripted) via the service create command. It is not as neat as the stack deploy, but allows for more control since I can throttle deployment within my wrapper script. What I also do to obtain a static IP per container is to first connect to the overlay network with the service create command, then SSH into the node that actually spawned the container and disconnect / reconnect from the overlay with the --ip option of network connect. This way, one is able to set a predefined IP for a container.
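For anyone who wants to try the same approach, here is a minimal Python sketch of what such a wrapper could look like. It is not my exact script: the batch size, pause, and all service, image and network names are made-up placeholders, and the static IP part only builds the disconnect / connect commands described above, which you would still have to run on the node that hosts the container.

```python
import subprocess
import time


def service_create_cmd(name, image, network):
    """Build a 'docker service create' command attached to an overlay network."""
    return ["docker", "service", "create", "--name", name,
            "--network", network, image]


def static_ip_cmds(network, container, ip):
    """Build the disconnect / reconnect pair that assigns a fixed IP.

    These must be run on the node where the container actually lives.
    """
    return [
        ["docker", "network", "disconnect", network, container],
        ["docker", "network", "connect", "--ip", ip, network, container],
    ]


def batches(items, size):
    """Yield consecutive slices of 'items' with at most 'size' elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def create_throttled(services, network, batch_size=10, pause_seconds=30.0,
                     run=lambda cmd: subprocess.run(cmd, check=True),
                     sleep=time.sleep):
    """Create (name, image) services in batches, pausing after each batch."""
    for batch in batches(services, batch_size):
        for name, image in batch:
            run(service_create_cmd(name, image, network))
        sleep(pause_seconds)  # give the nodes time to settle
```

Passing run and sleep as parameters makes it easy to do a dry run first, for example run=print to see the commands without executing anything.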