Hi all,
We have built a distributed worker system using docker-compose. The jobs read a lot of data from a PostgreSQL server in our company network. RabbitMQ serves as the job queue.
The problem is that, without warning, one worker reports a connection timeout to the database. After that, all workers stop working and I can no longer access the RabbitMQ management portal on its mapped port at localhost:8080.
Has the whole Docker network crashed? An error in one worker shouldn't affect the other workers or RabbitMQ.
Do you have any idea what happened?
How can I debug this scenario? I need a little help figuring out what has happened here.
To be specific, I cannot access localhost:8080, where the management portal should be. This normally works.
The RabbitMQ container itself is still running: docker exec tools_rabbitmq_1 rabbitmqctl list_queues
works, and pinging the RabbitMQ container from the worker containers gets correct responses.
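To make the localhost:8080 check repeatable, a small TCP probe along these lines could be run on the Windows host. This is only a sketch: the function name port_is_open and the 3-second timeout are my own choices, and the only target taken from the setup above is the mapped management port. It tells whether a TCP connection can be opened at all, not whether the management portal behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers connection refused, timeouts and name resolution failures
        return False


if __name__ == "__main__":
    # localhost:8080 is the mapped RabbitMQ management port described above;
    # add further (host, port) pairs here as needed (hypothetical extension)
    targets = [("localhost", 8080)]
    for host, port in targets:
        state = "reachable" if port_is_open(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

If the probe fails on the host while rabbitmqctl inside the container still works, that would point at the port mapping or the Docker network layer rather than at RabbitMQ itself.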
In the PostgreSQL logs I can see the following errors:
- incomplete startup packet
- could not receive data from client: An existing connection was forcibly closed by the remote host.
Currently Docker runs on my Windows 10 laptop. I have the latest versions of Docker:
- Docker version 17.03.1-ce, build c6d412e
- Docker-compose version 1.11.2, build f963d76f
Thank you for any help in advance!