Okay I’m just getting weird results all over…
What I’ve observed are strange things such as:
- If I stop the service (`systemctl --user stop docker`) on the compute node it's working on (n010), then the service starts normally without issues on another compute node (n011, for example), and `hello-world` runs.
- Now `hello-world` no longer runs on the head node until I restart Docker. Further, Docker fails to start at all on n010.
- It seems I can only successfully start Docker on the head node plus one compute node, and I can only actually run `hello-world` in one of those places; doing so then corrupts the other running instance.
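A quick way to confirm the daemons are colliding is to check whether the rootless Docker state directory sits on a shared filesystem. This sketch assumes the default rootless `data-root` of `~/.local/share/docker` (adjust if yours is configured elsewhere):

```shell
# Print the filesystem type backing the rootless Docker state directory.
# Assumption: data-root is the rootless default ~/.local/share/docker;
# if that directory doesn't exist, fall back to checking $HOME itself.
stat -f -c %T ~/.local/share/docker 2>/dev/null || stat -f -c %T ~
```

If this prints a network filesystem type (`nfs`, `lustre`, and so on), every node's daemon is writing into the same directory, which would match the "start one, corrupt the other" behaviour.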
Obviously, then, it’s something to do with the clustered environment and/or shared resources. Here are the relevant resources I can think of that are shared between the nodes:
- Users’ home directories, which include `~/.config/docker` and `~/.config/systemd` (which holds the Docker service unit files)
- The Docker `data-root`
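Since the `data-root` is one of the shared resources, one workaround sketch is to point each node's rootless daemon at node-local storage instead. The `/tmp` location here is an assumption; substitute whatever node-local disk the cluster provides (e.g. a `/scratch` mount). Because the path string is identical on every node but the storage behind it is local, a single `daemon.json` in the shared home can serve all nodes:

```shell
# Sketch (assumptions: /tmp is node-local on this cluster, and the
# rootless daemon reads ~/.config/docker/daemon.json).
mkdir -p "/tmp/$(id -un)/docker-data"
mkdir -p ~/.config/docker
cat > ~/.config/docker/daemon.json <<EOF
{
  "data-root": "/tmp/$(id -un)/docker-data"
}
EOF
# Then restart the user service on each node:
# systemctl --user restart docker
```

The trade-off is that images and containers become per-node (and disappear if `/tmp` is cleaned), but the daemons stop stomping on each other's state.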
So what’s the proper way to let users run Docker (rootless mode) in a clustered environment? Is this even possible/supported? I’ll be researching this in the meantime… If not, I’ll have to get the software running another way…