Docker Community Forums


Manager node fails while idle


(Jpringle11) #1

Expected behavior

After launching a stack and starting services, the manager node should remain up and stable while essentially idle.

Actual behavior

The manager node is recycled by the Auto Scaling group (ASG) after failing a health check, even though no activity has been occurring on it.

Additional Information

Docker for AWS 1.12.2-rc1 (beta6)

Launched a very simple stack consisting of 1 manager and 1 worker (both t2.small) and deployed a small set of services (2 nginx containers, 2 Java containers, 1 Couchbase container). The containers and the swarm worked well from Monday through the rest of the week. There was no activity over the weekend, but the ASG reported that the manager failed a health check, so it terminated the instance and relaunched it. Only the usual manager duties were running on the manager node, so there is no obvious reason why it should fail a health check and be recycled by the ASG.
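For context, the services were created roughly like this (a sketch from memory; the service names, ports, and image tags are illustrative, not the exact ones used):

```
# Two nginx replicas, two Java app replicas, one Couchbase instance
docker service create --name web --replicas 2 --publish 80:80 nginx:latest
docker service create --name app --replicas 2 my-java-app:latest   # hypothetical Java image
docker service create --name db --replicas 1 couchbase:latest
```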

Steps to reproduce the behavior

  1. Launch stack with 1 manager and 1 worker of size t2.small
  2. Deploy some basic services
  3. Leave idle until the ASG for the manager recycles it (in my case it took 5 days)

I understand that a single manager is not a production-grade setup, but I would at least expect it to stay up and stable when left essentially alone.

Is there a way to determine why the node failed the ASG health check?
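The only thing I can think to check is the ASG's own activity history (a sketch; assumes the AWS CLI is configured and `<manager-asg-name>` is replaced with the group created by the Docker for AWS template):

```
# List recent scaling activities and their status messages for the manager ASG
aws autoscaling describe-scaling-activities \
  --auto-scaling-group-name <manager-asg-name> \
  --max-items 10
```

That shows what the ASG did and when, but not why the instance stopped responding to the health check in the first place.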


(Nathan Le Claire) #2

It would be great to get a docker-diagnose ID for this issue (only possible with the newest AWS release, unfortunately). I suspect the Docker daemon was OOM-killed: the t2.small has only about 2 GB of RAM, and Java memory usage tends to climb quickly in my experience.
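A hedged way to confirm or rule that out the next time it happens (assumes SSH access to the manager instance; "app" below is a placeholder service name):

```
# On the manager host: check whether the kernel OOM killer fired,
# which would show dockerd or a Java process being killed.
dmesg | grep -i -E "out of memory|killed process"

# Possible mitigation (a sketch): cap the Java service's memory so it
# cannot exhaust the t2.small's ~2 GB of RAM.
docker service update --limit-memory 512m app
```

Capping the container memory only helps if the JVM heap is also sized to fit under the limit; otherwise the container just gets OOM-killed sooner.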