I would like to make it easier for contributors to my project to get started by dockerizing not only the deployment of the application, but also the development environment. The second is proving to be much harder than the first.
Goals of the setup:
1. Users can edit repo files on their native file system, then compile or otherwise operate on those files in a Docker container
2. Build cache persists beyond individual containers
3. Program outputs persist beyond individual containers
4. No files owned by root are left on the host file system after working in the container
5. (Preferred) The container doesn’t run as root and runs with the least possible privilege
I think these are ordinary goals that anyone setting up a dockerized dev environment would have, but it’s honestly not clear to me how to set this up. Docker is fighting me every step of the way. I don’t see how any of my problems are unique to me. Any dev environment would need these things, right?
Suppose the root of my project’s repo is $REPO.
(1) I think this means I want to bindmount $REPO:/repo.
(2) Because it’s hard to know where each dev’s system build cache is located, it’s hard to bindmount that. I would prefer the build cache to be a managed volume rather than a bindmount. So, I’ve been trying to define a managed volume, buildcache:/path/to/buildcache. I think any dev environment that has cacheable build objects needs to do something like this. We don’t want to download all of our project’s packages and rebuild the world every time we launch a container.
My build system happens to be Rust’s cargo, and cargo caches 3rd party packages (crates) at $CARGO_HOME, and other build objects at $REPO/target. I’m bindmounting $REPO, so that will persist. I just have to have a buildcache volume and make sure that $CARGO_HOME points to the mount point.
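For concreteness, (1) and (2) together might be wired up roughly like this. It is only a sketch: the image name `myproject-dev` is made up, `/path/to/buildcache` is the same placeholder as above, and the function just prints the `docker run` command rather than executing it, so it can be inspected first.

```shell
#!/bin/sh
# Sketch only: bind-mount the repo and point CARGO_HOME at a named volume.
# IMAGE is a hypothetical image name; CACHE_MOUNT is the placeholder path
# used in the text, not a real location.
REPO="${REPO:-$PWD}"
IMAGE="${IMAGE:-myproject-dev}"
CACHE_MOUNT="/path/to/buildcache"

# Print (do not run) the docker command for a cached cargo build.
dev_build_cmd() {
    printf 'docker run --rm -v %s:/repo -v buildcache:%s -e CARGO_HOME=%s -w /repo %s cargo build\n' \
        "$REPO" "$CACHE_MOUNT" "$CACHE_MOUNT" "$IMAGE"
}

dev_build_cmd
```

Since $REPO/target rides along with the bind mount, only the crate cache needs the volume. Note that nothing here sets a user, so the container still runs as root.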
(Another problem, though I think this one is really Cargo’s fault, is that $CARGO_HOME is not only a place where things get cached. It’s also where the installed toolchain’s binaries live, under $CARGO_HOME/bin. So if I change the toolchain in an image layer and then mount a clean $CARGO_HOME on top of it when starting a container … I blow away the installed toolchain. Cargo really ought to give me a single directory that serves as a build cache and nothing else, so I can mount a volume there. There’s already a bug filed about this on Cargo’s GitHub.)
(3) I can choose to write my program’s output files somewhere underneath /repo, say /repo/program_outputs, which is bindmounted to $REPO/program_outputs. That takes care of this requirement.
(4) This is where everything goes to hell. Docker wants to run everything as root. Argh. Argh! Why? Can’t there be a flag that says “I want to run this as myself”?
As it is, by default Docker leaves root-owned files in $REPO/target and $REPO/program_outputs. The simplest things, like analyzing my program_outputs with an analysis program on the host system, are now a colossal pain.
I want to run with the same UID and GID in the container as I have on the host. Isn’t this a normal thing to want? But in order to do it I have to do some sort of janky ARG thing to pass UID and GID to the image layer:
ARG HOST_UID
ARG HOST_GID
# Create a group and a user whose IDs match the host user's
RUN groupadd -g ${HOST_GID} myuser
RUN useradd -m -u ${HOST_UID} -g ${HOST_GID} myuser
USER myuser
Now instead of letting all my contributors use docker-compose run, I have to provide some sort of shell script that makes sure the image gets built with the right UID and GID passed in.
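Such a wrapper might look roughly like the sketch below. The service name `dev` is hypothetical and would have to match whatever the docker-compose.yml calls the dev container; the build-arg names are the ones from the Dockerfile snippet above.

```shell
#!/bin/sh
# Sketch of a wrapper that forwards the host UID/GID as build args.
# "dev" is a hypothetical service name in docker-compose.yml.
SERVICE="${SERVICE:-dev}"

# Build-arg flags matching the ARG names in the Dockerfile.
host_build_args() {
    printf '%s\n' "--build-arg HOST_UID=$(id -u) --build-arg HOST_GID=$(id -g)"
}

# Rebuild the image for the current user, then run a command in it.
dev_run() {
    # Word splitting of the flags is intentional here.
    # shellcheck disable=SC2046
    docker-compose build $(host_build_args) "$SERVICE" &&
        docker-compose run --rm "$SERVICE" "$@"
}

# Example: dev_run cargo build
```

Contributors would then call `dev_run cargo build` (or whatever command they need) instead of invoking docker-compose run directly.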
But even when I do that, it’s not enough to get this container working right: the managed volume buildcache is always owned by root, so when I run as an unprivileged user with my host UID, the buildcache volume is unwritable.
I read somewhere in the bowels of Docker’s GitHub issues that when a managed volume is created, it inherits the UID and GID from the image that first used it, and this is the blessed way to handle the problem. Well, unfortunately, it doesn’t seem to work at all. My managed volume is always owned by root. Is it because I’m defining the volumes in my docker-compose.yml?
# this doesn't make the buildcache volume owned by myuser
ARG HOST_UID
ARG HOST_GID
RUN groupadd -g ${HOST_GID} myuser
RUN useradd -m -u ${HOST_UID} -g ${HOST_GID} myuser
# Pre-create the mount point and hand it to myuser before dropping privileges
RUN mkdir -p /path/to/buildcache
RUN chown myuser:myuser /path/to/buildcache
USER myuser
Thanks in advance for any advice. Sorry if this came across as a whine. I have been headbanging for a while now.