I’m trying to troubleshoot a very weird problem. We have an in-house MySQL instance, pre-populated with data, that we pass around as a Docker image. Run the image and you get a ready-made instance of the database; changes are written directly into the container (no mounted volumes), so you just remove and restart the container to get back to the prepared state. Very convenient, very popular.
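For context, the workflow is essentially the following (the image name and port mapping here are illustrative, not our real ones):

    # start a throwaway database instance; all writes land in the container's writable layer
    docker run -d --name devdb -p 3306:3306 internal-registry/prepared-mysql:latest

    # ...work against the database, modify data...

    # discard all changes and get the prepared state back
    docker rm -f devdb
    docker run -d --name devdb -p 3306:3306 internal-registry/prepared-mysql:latest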
Anyway, I have noticed that the very same image behaves differently on different Docker installations. In one environment, MySQL accepts connections almost immediately after the docker run command returns. In another environment it takes about a minute before connections are accepted.
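For reference, "accepts connections" is measured with a simple poll along these lines (the client invocation and credentials are placeholders; any readiness check shows the same difference):

    # poll the server once a second and print how long it took to come up
    start=$(date +%s)
    until mysqladmin --host=127.0.0.1 --port=3306 --user=root --password=secret ping >/dev/null 2>&1; do
        sleep 1
    done
    echo "MySQL ready after $(( $(date +%s) - start ))s"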
Information about the slow environment (docker info):
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 0cae528dd6cb557f7201036e9f43420650207b58.m
runc version:
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.4.10-arch1-1
Operating System: EndeavourOS
OSType: linux
Architecture: x86_64
CPUs: 20
Total Memory: 31.02GiB
Name: matthias-precision5570
ID: 92da1558-3509-4f76-aac0-ccecf9bee9b7
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Information about the fast environment (docker info):
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
runc version: v1.1.8-0-g82f18fe
init version: de40ad0
Security Options:
seccomp
Profile: builtin
Kernel Version: 4.18.0-348.7.1.el8_5.x86_64
Operating System: CentOS Linux 8
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 15.46GiB
Name: localhost.localdomain
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
docker.internal.pe:5000
127.0.0.0/8
Live Restore Enabled: false
Please note that the fast environment actually has the worse specs (1 CPU and less RAM, versus 20 CPUs and 31 GiB on the slow machine). The fast environment is running XFS, the slow one ext4. We have tried reinstalling the OS on the slow machine with XFS instead; it makes no difference. The kernel version in the slow environment is 6.4.10-arch1-1, versus 4.18.0-348.7.1.el8_5.x86_64 in the fast environment. That is the only meaningful difference I can find.
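For completeness, the comparison above comes from nothing more exotic than reading the daemon and kernel info on each host, roughly:

    # storage driver, backing filesystem and kernel as reported on each host
    docker info | grep -iE 'storage driver|backing filesystem|kernel version'
    uname -r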
But how can I troubleshoot what’s happening? Obviously a startup time of under a second is much more attractive than a minute, since removing a container and starting a new one from this image is a very common operation.
Edit: I forgot to mention that we have also reproduced the problem in two virtual machines running on the same host machine. In simple terms, from our point of view: install CentOS 8 and you get fast startup; install EndeavourOS and you get slow startup.