Copy-on-write option

I have an application I would like to debug. However, this application has 40+ TERABYTES of data, which would be prohibitive to copy every time I wanted to run a container. I don’t actually need a copy of the data; I just need read access to what’s there. However, to properly debug the app, I also need to write data, and I do NOT need those writes to persist. I just need to be able to write while debugging/testing.

To this end, what I would truly prefer to have is a “:cow” option, similar to the “:ro” option one can use now when setting up a volume or when passing ‘-v’ to the ‘docker run’ command.

I have been researching this, and though others have at least tried to achieve the same thing (with paltry Gigabytes of data), none of their approaches have worked for me so far. In fact, their most common solution was to simply copy the data even though they didn’t want to do so. This really won’t work for me, especially since there may be multiple people debugging/testing this app at any given time.

Since Docker already implements an overlay system, it would be extremely helpful if it were made available for volumes in this manner.
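Until something like that exists, the same effect can be approximated by hand on the host with the kernel’s overlayfs: the big data directory is the read-only lower layer, a scratch directory takes all writes, and the merged view is what gets bind-mounted into the container. A minimal sketch, assuming a Linux host with root access; the paths, the image name myimage, and the helper names overlay_opts and cow_mount are made up for illustration:

```shell
#!/bin/sh
# Sketch: expose a huge read-only tree as a writable, throwaway overlay.
# All paths and function names here are hypothetical.

# Build the overlayfs mount option string from lower/upper/work dirs.
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s,workdir=%s' "$1" "$2" "$3"
}

# Mount $1 (the read-only data) at $3, keeping all writes under scratch dir $2.
# Needs root; upper and work end up on the same filesystem because both
# live under the scratch dir.
cow_mount() {
    lower=$1 scratch=$2 merged=$3
    mkdir -p "$scratch/upper" "$scratch/work" "$merged"
    mount -t overlay overlay \
        -o "$(overlay_opts "$lower" "$scratch/upper" "$scratch/work")" \
        "$merged"
}

# Usage (as root), then bind-mount the merged view into the container:
#   cow_mount /data/huge /tmp/cow-scratch /mnt/cow-view
#   docker run --rm -v /mnt/cow-view:/data myimage
#   umount /mnt/cow-view && rm -rf /tmp/cow-scratch
```

Each person debugging would use their own scratch and merged directories over the same lower directory. Two caveats: paths containing commas or colons would break the option string, and overlayfs does not expect the lower directory to change while it is mounted.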

Even though I don’t have to deal with that much data, I’d like to have that feature too.
It would also be useful for bind mounts from the host.

Scenario: you’ve got a huge source tree on the developer workstation and you’d like to run a clean build, but you don’t want the build container to touch your source tree (the build job might do things like git clean, etc.). You just want a snapshot, which is then removed along with the containers.

For now, I’m doing a cp --reflink to some temp dir and bind-mounting that into the container. But that doesn’t work when you’re already running inside a container: the Docker host interprets the source path as a path on the host side, not as a path within the calling container.
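
The reflink workaround can be sketched roughly as follows, assuming GNU cp and a host-side shell. The helper name make_snapshot, the image name mybuildimage, and the paths are made up for illustration. Note that this sketch uses --reflink=auto, which falls back to a full copy when the filesystem cannot share blocks, whereas a bare --reflink fails in that case.

```shell
#!/bin/sh
# Sketch of the reflink-snapshot workaround; names are hypothetical.

# Copy the contents of $1 into a fresh temp dir and print that dir's path.
# --reflink=auto shares blocks where the filesystem supports it
# (e.g. Btrfs, or XFS with reflink enabled) and otherwise does a normal copy.
make_snapshot() {
    snap=$(mktemp -d) || return 1
    cp -a --reflink=auto "$1/." "$snap/" || return 1
    printf '%s\n' "$snap"
}

# Usage:
#   snap=$(make_snapshot "$HOME/src/project")
#   docker run --rm -v "$snap":/src mybuildimage make
#   rm -rf "$snap"
```

Block sharing only happens when the temp dir is on the same filesystem as the source tree, so TMPDIR may need to point somewhere on that filesystem.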