FROM python:latest
WORKDIR /app
COPY requirements.txt ./
RUN apt-get update && apt-get install -y default-mysql-client
RUN pip install --no-cache-dir -r requirements.txt
CMD [ "python", "app.py" ]
app.py is a fairly simple, lightweight script with negligible CPU and memory usage, running 24/7. It writes a lot of log output to the container and also writes to the MySQL database. After approximately 15-18 hours, it stops writing logs at around line 47,000 (I can see the last timestamp). It then continues working and saving data to the database for about one more hour, and then it stops completely: no new logs (only the 47k lines of successful runs) and no new data saved to the DB.
When Python execution suddenly stops, I can still attach to the container, see the files there, etc. But I cannot stop it: “docker compose stop” keeps counting seconds indefinitely without stopping it. The only solution at this point is to run “systemctl restart docker.socket docker.service”, which then keeps it going for another 15 or so hours.
Have you checked the daemon logs? journalctl -xu docker.service should give some more insights.
Apart from that, I would suggest getting rid of tty: true as it keeps the container open, even though the foreground process might be terminated.
I would also recommend configuring your application logs to log directly to the console or stdout/stderr, this way the logs are stored on the host, outside the container filesystem.
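For illustration, here is a minimal sketch of what logging to stdout could look like in a Python script. The function name configure_logging, the logger name "app" and the format string are assumptions made for this example, not taken from the original app.py.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Route all log records to stdout so the container runtime can capture them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.handlers.clear()  # drop any previously configured handlers (e.g. file handlers)
    root.addHandler(handler)
    root.setLevel(level)
    return root


if __name__ == "__main__":
    configure_logging()
    # "app" is a hypothetical logger name used only for this example
    logging.getLogger("app").info("worker started")
```

With a setup like this, nothing is written to a log file inside the container; the output goes to whatever logging driver Docker is configured with.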
Furthermore, please share the output of docker version and docker info.
Have you checked the daemon logs? journalctl -xu docker.service should give some more insights.
I have not, no, but I will the next time it occurs.
I would also recommend configuring your application logs to log directly to the console or stdout/stderr, this way the logs are stored on the host, outside the container filesystem.
This is what I have. When I say logs, I mean the actual Docker logs, which are saved in a JSON file. Those are the ones I am looking at.
Furthermore, please share the output of docker version and docker info.
Docker version:
Client: Docker Engine - Community
Version: 24.0.1
API version: 1.43
Go version: go1.20.4
Git commit: 6802122
Built: Fri May 19 18:06:21 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 24.0.1
API version: 1.43 (minimum version 1.12)
Go version: go1.20.4
Git commit: 463850e
Built: Fri May 19 18:06:21 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.21
GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc:
Version: 1.1.7
GitCommit: v1.1.7-0-g860f061
docker-init:
Version: 0.19.0
GitCommit: de40ad0
As additional info, exactly the same symptoms occurred with an unmodified Selenium Hub container: the app itself is not responding, nodes are not connecting, and it can’t be stopped unless Docker is restarted, but I could attach to it and see the filesystem. So I assume the cause is not my Python application, but rather some incompatibility or misconfiguration of Docker itself.
Since 24.0 in general is rather new, it’s worth checking whether there is already an active issue for this: https://github.com/moby/moby/issues
The output of docker info looks good to me. We can see that the overlay2 storage driver is used and everything else looks pretty much like on every other Ubuntu 22.04 system.
Next time you have the issue, you might want to check the system load as well (e.g. with top, htop or even uptime). The three numbers are the average load during the last minute, the last five minutes and the last fifteen minutes. With 4 CPUs, your system can handle a load of up to 4.0; anything beyond that will slow the system down. This is just a shot in the dark, but I feel it is worth mentioning.
Thanks for your help. I actually updated from 23 to 24 hoping it would resolve my issue, but it did not help. As a temporary workaround, I will have my script exit every few hours and rely on the “restart: always” policy to start it again.
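If it is easier to record this from a script than to watch top, a small Python sketch of the same check might look like the following. The function name load_report and the "overloaded" rule (1-minute load above the CPU count) are assumptions for this example; os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_report(loadavg=None, cpus=None):
    """Compare the load averages with the number of CPUs.

    loadavg and cpus can be passed in explicitly; otherwise they are
    read from the running system.
    """
    one, five, fifteen = loadavg if loadavg is not None else os.getloadavg()
    cpus = cpus or os.cpu_count() or 1
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "cpus": cpus,
        # Rule of thumb only: sustained load above the CPU count means
        # processes are waiting for CPU time (or, on Linux, for I/O).
        "overloaded": one > cpus,
    }


if __name__ == "__main__":
    print(load_report())
```

Running it periodically on the host, for example shortly before the container usually hangs, would show whether the load climbs over time.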
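Roughly what I have in mind is sketched below. The 4-hour limit, the names should_exit and do_work, and the placeholder work loop are assumptions for illustration, not the real app.py.

```python
import sys
import time

# Hypothetical limit: exit after 4 hours so the restart policy brings
# the container back up with a fresh process.
MAX_RUNTIME_SECONDS = 4 * 60 * 60


def should_exit(started_at, now, max_runtime=MAX_RUNTIME_SECONDS):
    """Return True once the process has been running for max_runtime seconds."""
    return now - started_at >= max_runtime


def do_work():
    """Placeholder for one iteration of the real script."""
    time.sleep(1)


def main():
    started_at = time.monotonic()
    while not should_exit(started_at, time.monotonic()):
        do_work()
    # With "restart: always", Docker restarts the container after the
    # main process exits, whatever the exit code is.
    sys.exit(0)


if __name__ == "__main__":
    main()
```

This only works around the symptom, and it relies on Python being the container's main process, so that the container stops when the script exits.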
Have you checked whether removing tty: true makes a difference? Of course, you need to make sure Python is running as the foreground process to keep the container running.
Some people use tty: true as a workaround because their application does not keep running on its own. It might interfere with the container lifecycle, though.
You still might want to raise an issue, as the behavior you experience is unexpected.