Questions around cgroups

Hi all,

I’ve spent the last few months trying to get a deeper understanding of Linux cgroups and how they are used with containers and systemd. I have a few questions, but they need a fair amount of context, so I’ve written up my understanding in one blog post, Cgroups Introduction, and the questions themselves in another, Cgroups Questions.

Any thoughts, input, or answers would be much appreciated! I’m planning to cross-post in a few places, such as the systemd and podman mailing lists/communities, but I imagine this community is well placed to give input on most of the questions, as they primarily involve Docker.

To summarize the questions (taken from the second post linked above):

  • Why are private cgroups mounted read-only in non-privileged containers?
  • Is it sound to override Docker’s mounting of the private container cgroups under v1?
    • What are the concerns around the approach of passing -v /sys/fs/cgroup:/sys/fs/cgroup (for running systemd) in terms of the container’s view of its cgroups?
    • Is modifying/replacing the cgroup mounts set up by the container engine a reasonable workaround, or could this be fragile?
  • When is it valid to manually manipulate container cgroups?
    • Do container managers such as Docker correctly delegate cgroups on hosts running systemd?
    • Are these container managers happy for the container to take ownership of the container’s cgroup?
  • Why are the container’s cgroup limits not set on a parent cgroup under Docker?
    • Why doesn’t Docker use another layer of indirection in the cgroup hierarchy, so that the limit is applied on a parent cgroup of the container’s own cgroup?
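
For anyone who wants to check the read-only mount behaviour from the first question on their own setup, here is a minimal sketch in Python that lists the cgroup mounts visible to a process and whether each is mounted read-only. It relies only on the documented /proc/self/mountinfo format; the function name cgroup_mounts and the SAMPLE text are made up for illustration and are not taken from Docker or from the blog posts.

```python
import os

# Hypothetical sample in /proc/<pid>/mountinfo format (not captured from a real host).
SAMPLE = """\
100 90 0:30 /docker/abc /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime master:11 - cgroup cgroup rw,memory
101 90 0:31 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - cgroup2 cgroup2 rw
102 90 0:32 / /proc rw,nosuid - proc proc rw
"""


def cgroup_mounts(mountinfo_text):
    """Return (mount_point, fstype, read_only) for each cgroup v1/v2 mount."""
    results = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Fields 0-5 are fixed; optional fields follow, ended by a lone "-".
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed or empty line
        if len(fields) <= sep + 1:
            continue
        fstype = fields[sep + 1]
        if fstype not in ("cgroup", "cgroup2"):
            continue
        mount_point = fields[4]
        # Per-mount options; "ro" here means this mount is read-only.
        mount_options = fields[5].split(",")
        results.append((mount_point, fstype, "ro" in mount_options))
    return results


if __name__ == "__main__":
    path = "/proc/self/mountinfo"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    else:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    for mount_point, fstype, read_only in cgroup_mounts(text):
        print(mount_point, fstype, "ro" if read_only else "rw")
```

Running this inside a non-privileged container and again inside a privileged one should make the difference in mount options easy to compare.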

Thanks in advance,
Lewis