Same image of MySQL, same docker version, very different startup performance

I’m having a very weird problem I’m trying to troubleshoot. We have an in-house copy of MySQL, prepared with data, that we pass around as a Docker image. Just run the image and you get an instance of the database. Changes are written directly into the container (no mounted volumes), so removing and restarting the container gets you back to the previous state. Very convenient, very popular.

Anyway, I have noticed that the very same image behaves differently in different Docker installations. In one environment, the container starts and MySQL accepts connections immediately after the docker run command returns. In another environment, it takes about 1 minute before connections are accepted.

Information about slow environment (docker info)

 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 0cae528dd6cb557f7201036e9f43420650207b58.m
 runc version: 
 init version: de40ad0
 Security Options:
   Profile: builtin
 Kernel Version: 6.4.10-arch1-1
 Operating System: EndeavourOS
 OSType: linux
 Architecture: x86_64
 CPUs: 20
 Total Memory: 31.02GiB
 Name: matthias-precision5570
 ID: 92da1558-3509-4f76-aac0-ccecf9bee9b7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
 Live Restore Enabled: false

Information about fast environment (docker info)

 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
 runc version: v1.1.8-0-g82f18fe
 init version: de40ad0
 Security Options:
   Profile: builtin
 Kernel Version: 4.18.0-348.7.1.el8_5.x86_64
 Operating System: CentOS Linux 8
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 15.46GiB
 Name: localhost.localdomain
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
 Live Restore Enabled: false

Please note that the fast environment actually has worse specs (1 CPU and 15.46GiB of memory, versus 20 CPUs and 31.02GiB in the slow one). The fast environment is running XFS, the slow one EXT4. We tested reinstalling the slow environment's OS on XFS instead, and it made no difference. The kernel version is 6.4.10-arch1-1 in the slow environment, but 4.18.0-348.7.1.el8_5.x86_64 in the fast environment. This is the only meaningful difference I can find.
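To check for differences beyond the ones listed above, the two docker info dumps can be compared key by key. This is only a minimal sketch, assuming each dump has been saved as plain text. `parse_info` and `diff_info` are hypothetical helper names, and nested entries are flattened, so a sub-key that appears under more than one heading would collide.

```python
def parse_info(text):
    """Map each 'Key: value' line of `docker info` output to its value."""
    result = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain colons themselves.
        key, sep, value = line.strip().partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def diff_info(a_text, b_text):
    """Return {key: (a_value, b_value)} for every key whose values differ.

    A key missing on one side is reported with None for that side.
    """
    a, b = parse_info(a_text), parse_info(b_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

Run against the two outputs above, this reports Cgroup Driver and Cgroup Version among the differing keys, next to the kernel, filesystem, and hardware entries.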

But how can I troubleshoot what’s happening? Obviously it’s much more attractive to have a startup time of less than one second. Removing a container and starting a new one from this image is a very common operation.
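A first step would be to measure the delay the same way in both environments. The sketch below is one possible way to do that, not a definitive tool: it assumes the container publishes MySQL on a host port, and `wait_for_greeting` is a hypothetical helper name. It waits until the server actually sends bytes, because MySQL sends a handshake packet as soon as a client connects, whereas a bare TCP connect to a published port can succeed before mysqld is listening.

```python
import socket
import time


def wait_for_greeting(host, port, timeout=120.0, interval=0.2):
    """Return the seconds elapsed until the server sends its first byte."""
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0) as sock:
                sock.settimeout(2.0)
                # An empty read means the peer closed without greeting us.
                if sock.recv(1):
                    return time.monotonic() - start
        except OSError:
            pass  # refused, reset, or timed out: not ready yet
        time.sleep(interval)
    raise TimeoutError(f"no greeting from {host}:{port} within {timeout}s")
```

Calling `wait_for_greeting("127.0.0.1", 3306)` right after docker run returns (the address and port are assumptions about how the container is published) gives a number that can be compared between the two hosts and re-checked after each configuration change.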

Edit: Forgot to mention that we have also reproduced the problem in two virtual machines running on the same host machine. Basically, from our point of view, in simple terms: install CentOS 8 and you get fast performance; install EndeavourOS and you get slow performance.

The first machine, with the newer kernel, uses cgroup v2, while the second machine uses cgroup v1.
You could check whether reverting from v2 back to v1 makes a difference: