Docker Community Forums


Node becomes unreachable after Agent Update


(Alexagranov) #1

Hi there,

I’ve noticed for a couple of weeks now that a single AWS node launched from Docker Cloud becomes unreachable after a few days of successful uptime. AWS EC2 dashboard shows the instance in a healthy state the entire time.

Today, inspecting the node's Timeline for the first time, I saw a “Node Update (agent)” activity that coincides with the moment the node became unreachable.

Is this a known issue with a possible workaround?

Thank you,
-alex


(Cpmdock) #2

Just had the same thing happen to us. A node went down after an update. Is there any fix or workaround for this?


(Stephen Pope) #3

The same thing happened to one of our nodes running Ubuntu 16.04 LTS. We figured out that the aufs driver was incompatible with the updated kernel, so installing the extra packages listed here solved it for us.

But the question still remains: shouldn’t the dockercloud-agent daemon take care of this? I thought that was one of the benefits of Docker Cloud over managing my own infrastructure.
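
For anyone who wants to check whether they are hitting the aufs/kernel mismatch described above, here is a minimal sketch in Python. It is not something the agent provides; it only assumes a Linux host where /proc/filesystems lists the filesystems the running kernel has registered and where kernel modules live under /lib/modules/<kernel release>. The function names (parse_filesystems, find_module_files, aufs_status) are made up for this example.

```python
import os
import platform


def parse_filesystems(text):
    """Return the set of filesystem names in /proc/filesystems-style text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            # Lines look like "nodev<TAB>proc" or "<TAB>ext4"; the name is the last field.
            names.add(fields[-1])
    return names


def find_module_files(modules_dir, module_name):
    """Return paths of kernel module files for module_name under modules_dir."""
    matches = []
    # os.walk yields nothing if modules_dir does not exist.
    for root, _dirs, files in os.walk(modules_dir):
        for filename in files:
            # Module files may be compressed, e.g. aufs.ko or aufs.ko.xz.
            if filename.split(".")[0] == module_name and ".ko" in filename:
                matches.append(os.path.join(root, filename))
    return matches


def aufs_status(proc_filesystems="/proc/filesystems", modules_root="/lib/modules"):
    """Report whether aufs is registered and whether a module exists for this kernel."""
    kernel = platform.release()
    try:
        with open(proc_filesystems) as handle:
            registered = "aufs" in parse_filesystems(handle.read())
    except OSError:
        registered = False
    module_files = find_module_files(os.path.join(modules_root, kernel), "aufs")
    return {"kernel": kernel, "registered": registered, "module_files": module_files}


if __name__ == "__main__":
    print(aufs_status())
```

If "registered" is False and "module_files" is empty after a kernel update, the running kernel most likely has no aufs module installed, which would match the symptom above. An empty "registered" with a non-empty "module_files" may just mean the module has not been loaded yet.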


(Jonathanparrilla) #4

This just happened to me as well. Unfortunately, this process is not perfect. It updated the agent on two nodes. On one of them, it worked just fine, while on the other it was botched and the services on that server stopped working properly.

This resulted in our production environment working at half capacity.

I wonder if Docker will have a fix for this or if we are left to roll the dice every time they do this.

Perhaps we could schedule the agent upgrade, so that if something goes wrong we are not caught off guard?