After a few hours, Python in a container stops writing logs, and some time later the Python process stops entirely

luckylemon33 · May 24, 2023, 8:11pm

I’ve been dealing with my issue for a few days now. Host machine is Ubuntu 22.04. Docker compose file:

version: '3'

services:

  service01:
    build: ./folder
    tty: true
    volumes:
      - ./app:/app

    extra_hosts:
    - "host.docker.internal:host-gateway"
    
    environment:
      PLATFORM: test
    
    depends_on:
      chrome:
        condition: service_healthy
    
    restart: always
    
    networks: 
      - docker_network01

networks:
  docker_network01: 
    name: docker_network01

Dockerfile

FROM python:latest

WORKDIR /app

COPY requirements.txt ./

RUN apt-get update && apt-get install -y default-mysql-client

RUN pip install --no-cache-dir -r requirements.txt

CMD [ "python", "app.py" ]

app.py is a fairly simple, lightweight script with barely any CPU or memory usage, running 24/7. It writes a lot of logs to the container output and also writes to the MySQL database. After approximately 15-18 hours, it stops saving logs at around line #47,000 (I can see the last timestamp). It then continues working and saving data to the database for about one more hour, and then it stops: no logs (except the 47k lines of successful run history) and no new data saved to the DB.

When Python execution suddenly stops, I can attach to the container, see the files there, etc. But I cannot stop it: “docker compose stop” keeps counting seconds indefinitely without stopping it. The only solution at this point is to run “systemctl restart docker.socket docker.service”, which then keeps it going for another 15 or so hours.

What are my options to troubleshoot?

meyay · May 24, 2023, 8:31pm

This is indeed strange behavior.

Have you checked the daemon logs?
journalctl -xu docker.service should give some more insights.

Apart from that, I would suggest getting rid of tty: true as it keeps the container open, even though the foreground process might be terminated.

I would also recommend configuring your application logs to log directly to the console or stdout/stderr, this way the logs are stored on the host, outside the container filesystem.
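
As a rough illustration of what logging to stdout could look like on the Python side (a minimal sketch, not the actual app.py; the function name build_stdout_logger and the logger name "app" are made up for this example):

```python
import logging
import sys


def build_stdout_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes timestamped lines to stdout.

    The name "app" is a placeholder chosen for this sketch.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = build_stdout_logger()
log.info("worker started")
```

StreamHandler flushes after every record, so each line reaches Docker's log driver right away. If the script uses print() instead, setting PYTHONUNBUFFERED=1 in the container environment should avoid output being held back in a buffer.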

Furthermore, please share the output of docker version and docker info.

luckylemon33 · May 24, 2023, 8:40pm

Have you checked the daemon logs? journalctl -xu docker.service should give some more insights.

I have not, but I will the next time it occurs.

I would also recommend configuring your application logs to log directly to the console or stdout/stderr, this way the logs are stored on the host, outside the container filesystem.

This is what I have. When I say logs, I mean the actual Docker logs, which are saved in a JSON file. I am looking at them.
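
For reference, the json-file driver writes one JSON object per line with "log", "stream" and "time" keys, so the last timestamp can be pulled out with a few lines of Python. This is only a sketch: the function name last_log_time and the sample entries are invented for illustration, and the real file path has to be looked up (for example with docker inspect --format '{{.LogPath}}' on the container).

```python
import json
from typing import Iterable, Optional


def last_log_time(lines: Iterable[str]) -> Optional[str]:
    """Return the "time" field of the last parseable log entry, or None."""
    last = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a truncated or partially written line
        if isinstance(entry, dict) and "time" in entry:
            last = entry["time"]
    return last


# Made-up sample entries in the json-file layout, not real output.
SAMPLE = [
    r'{"log":"run 46999 ok\n","stream":"stdout","time":"2023-05-24T10:00:00.000000001Z"}',
    r'{"log":"run 47000 ok\n","stream":"stdout","time":"2023-05-24T10:00:05.000000002Z"}',
    r'{"log":"run 47001 o',  # simulated truncated trailing line
]

print(last_log_time(SAMPLE))
```

With a real log file, the same function can be fed an open file object, since iterating over a file yields its lines.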

Furthermore, please share the output of docker version and docker info .

Docker version:

Client: Docker Engine - Community
 Version:           24.0.1
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        6802122
 Built:             Fri May 19 18:06:21 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.1
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       463850e
  Built:            Fri May 19 18:06:21 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info:

Client: Docker Engine - Community
 Version:           24.0.1
 API version:       1.43
 Go version:        go1.20.4
 Git commit:        6802122
 Built:             Fri May 19 18:06:21 2023
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Containers: 19
  Running: 0
  Paused: 0
  Stopped: 19
 Images: 115
 Server Version: 24.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.19.0-42-generic
 Operating System: Ubuntu 22.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.67GiB
 Name: homeserver
 ID: VJCA:TAI5:3WJS:HCRD:MST5:K62R:OLMX:CPSP:NP3U:GYDG:WNTY:KU5C
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

luckylemon33 · May 24, 2023, 8:44pm

As additional info, exactly the same symptoms were happening with an unmodified Selenium Hub container. The app itself was not responding, nodes were not connecting, and it could not be stopped unless Docker was restarted, but I could attach to it and see the filesystem. So I assume the cause is not my Python application, but rather some incompatibility or misconfiguration of Docker itself.

meyay · May 24, 2023, 9:20pm

Since 24.0 in general is rather new, it’s worth checking whether there is already an active issue for this: https://github.com/moby/moby/issues

The output of docker info looks good to me. We can see that the overlay2 storage driver is used and everything else looks pretty much like on every other Ubuntu 22.04 system.

Next time you have the issue, you might want to check the system load as well (e.g. with top, htop or even uptime). The numbers are the average load over the last minute, the last five minutes and the last fifteen minutes. With 4 CPUs, your system can handle a load of up to 4.0; anything beyond that will slow the system down. This is just a shot in the dark, but I feel it is worth mentioning.
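
The same comparison of load against CPU count can be done from Python on the host, if that is more convenient than reading uptime. A minimal sketch using only the standard library; the helper names load_per_cpu and is_overloaded are made up here, and "load above the CPU count" is just the rule of thumb from above, not a hard limit:

```python
import os
from typing import Tuple


def load_per_cpu() -> Tuple[float, ...]:
    """Return the 1, 5 and 15 minute load averages divided by the CPU count."""
    cpus = os.cpu_count() or 1
    return tuple(avg / cpus for avg in os.getloadavg())


def is_overloaded(load: float, cpus: int) -> bool:
    """Rule of thumb: more runnable tasks than CPUs means things queue up."""
    return load > cpus


one, five, fifteen = os.getloadavg()
cpus = os.cpu_count() or 1
print(f"load: {one:.2f} {five:.2f} {fifteen:.2f} on {cpus} CPUs")
print("overloaded (1 min):", is_overloaded(one, cpus))
```

os.getloadavg() is only available on Unix-like systems, which matches the Ubuntu host described here.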

luckylemon33 · May 24, 2023, 9:30pm

Thanks for your help. I actually updated from 23 to 24 hoping it would resolve my issue, but it did not help. As a temporary workaround, I will have my script exit every few hours and rely on the “restart: always” policy to start it again.
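
Roughly, the workaround could look like the following sketch. It is not the real app.py: run_until_deadline, its parameters and the four-hour figure are assumptions made for illustration. The loop does its work until a deadline passes and then returns, so the script ends and Docker's restart policy starts the container again.

```python
import time
from typing import Callable


def run_until_deadline(
    work: Callable[[], None],
    max_runtime_s: float,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call work() repeatedly until max_runtime_s has elapsed.

    Returns the number of completed iterations. clock and sleep are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + max_runtime_s
    iterations = 0
    while clock() < deadline:
        work()
        iterations += 1
        sleep(interval_s)
    return iterations


# Hypothetical usage at the bottom of the script (values are examples):
#   run_until_deadline(do_one_run, max_runtime_s=4 * 60 * 60, interval_s=60)
# When the function returns, the script ends, the container exits, and
# "restart: always" brings it back up.
```

time.monotonic is used instead of time.time so that clock adjustments on the host do not shorten or extend the run.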

meyay · May 25, 2023, 6:24am

Have you tried whether removing tty: true makes a difference? Of course, you need to make sure Python runs as the foreground process to keep the container running.

Some people use tty: true as a workaround because their application is not kept running. Though, it might interfere with the container lifecycle.

You still might want to raise an issue, as the behavior you experience is unexpected.