Persistent storage copy policy

Hi,
I want to use persistent named storage.
I've read in the documentation that when I mount a named volume onto a container directory that already has data in it, that data is copied into the persistent storage.

That's OK, but what happens the next time, when I run a new container instance with the same named storage?
By then the named storage holds updated data created during the last run, while the container's directory still holds the old data from the image.

Which one wins?

And what is the best practice for using persistent named storage as, for example, database storage?

thanks
David

It's more complicated than that. IIRC, if you docker run -v name:/path/in/container, the named volume doesn't already exist, and the Dockerfile first COPYs content into the path and then declares it a VOLUME directory, then docker run will create the volume and copy the content into it. In any other case (-v naming a host directory, a volume you created with docker volume create, or the volume left over from a previous docker run), the existing contents of that volume or host directory are mounted over the container path and whatever was in the image is completely ignored.

(This is one of a reasonably long list of Docker features that I’d never use: since you always have to be able to start from an empty data directory, just make that be the default in the container and don’t confuse users with these special cases.)

So on your second run, the named storage, which already has data in it, is used unmodified, and the data in the image is completely ignored.

IME, if you’re in an environment like Kubernetes that supports storage as a higher-level concept, use things like its persistent volume claims. If you’re in a single-host setup, prefer host paths to named Docker volumes (your existing backup/migration schemes will work fine without introducing something Docker-specific).

(Disclaimer: I’ve never actually used Docker Swarm and have never looked at the fancier storage drivers.)
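As a rough illustration of the host-path point: when the data lives in an ordinary host directory, ordinary tools can back it up. This is only a sketch; the function name and the throwaway paths are made up for the example, and for a real database you would normally stop the container (or use the database's own dump tool) before copying its files.

```shell
#!/bin/sh
# Hypothetical helper: archive a bind-mounted data directory with plain tar.
# $1 = host directory that is bind-mounted into the container
# $2 = path of the tar archive to write (keep it outside $1)
backup_data_dir() {
    src=$1
    archive=$2
    # -C changes into the directory first, so the archive holds relative paths
    tar -C "$src" -cf "$archive" .
}

# Example run against a throwaway directory (a stand-in for the real data dir)
workdir=$(mktemp -d)
mkdir "$workdir/d"
echo "hello world" > "$workdir/d/out.txt"
backup_data_dir "$workdir/d" "$workdir/backup.tar"
tar -tf "$workdir/backup.tar"    # list what ended up in the archive
```

Nothing here is Docker-specific, which is the point of preferring a host path on a single host.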

(And as an image author, don’t try to pre-initialize VOLUMEs, and COPY your application in rather than getting it from a volume; in both cases so that the image is reproducible without depending on details of what’s already on the host or not.)
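One way to follow that advice is to let the container's startup script seed the data directory when it finds it empty, instead of relying on Docker copying image content into a volume. A minimal sketch, assuming a made-up init_data_dir helper and a placeholder out.txt file standing in for real initial data:

```shell
#!/bin/sh
# Hypothetical startup helper: seed the data directory only when it is empty.
init_data_dir() {
    dir=$1
    mkdir -p "$dir"
    # ls -A lists every entry except . and .., so empty output means empty dir
    if [ -z "$(ls -A "$dir")" ]; then
        echo "hello world" > "$dir/out.txt"
    fi
}

# Example run against a throwaway directory
demo=$(mktemp -d)
init_data_dir "$demo"              # empty: out.txt is created
echo goodbye >> "$demo/out.txt"    # simulate data written during a run
init_data_dir "$demo"              # not empty: existing data is left alone
cat "$demo/out.txt"
```

The second call leaves the file untouched, so the final cat prints "hello world" followed by "goodbye". The same script behaves identically whether /data is a named volume, a host directory, or nothing at all.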


Experimenting with this isn’t that hard and might help give you some intuition into what’s going on…

FROM busybox
RUN mkdir /data
RUN echo hello world > /data/out.txt
VOLUME /data
CMD ["/bin/sh", "-c", "cat /data/out.txt && echo goodbye >> /data/out.txt"]

If you run:

docker build -t volumedemo .
docker run --rm volumedemo
docker run --rm volumedemo

it will implicitly create a fresh anonymous volume for you each time, populated from the image, and so print "hello world" on both runs.

docker run --rm -v vv:/data volumedemo
docker run --rm -v vv:/data volumedemo
docker volume rm vv

it will implicitly create the volume vv for you the first time and print "hello world"; the second time it will reuse that volume and print "hello world", "goodbye".

mkdir d
docker run --rm -v "$PWD/d":/data volumedemo

will fail, because the empty host directory is mounted over the image's /data, so /data/out.txt is absent.

docker volume create vv
docker run --rm -v vv:/data volumedemo
docker volume rm vv

works, much to my surprise; it looks like Docker keys off the named volume being totally empty?

docker volume create vv
docker run --rm -v vv:/data busybox touch /data/foo
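docker run --rm -v vv:/data volumedemo

fails: the volume is no longer empty, so nothing is copied from the image and /data/out.txt doesn't exist.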

mkdir -p d
echo foo > d/out.txt
docker run --rm -v "$PWD/d":/data volumedemo
docker run --rm -v "$PWD/d":/data volumedemo
cat d/out.txt

you will see the contents of the host file changing
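The outcomes of the experiments above can be summarised as a small decision function. This only models what the experiments showed, not how Docker is implemented, and the function name and its two arguments are invented for the illustration.

```shell
#!/bin/sh
# Hypothetical model of the behaviour observed in the experiments above.
# $1 = mount kind:  "volume" (named or anonymous Docker volume)
#                   "bind"   (host directory)
# $2 = state of the mount source before the container starts:
#                   "empty" or "nonempty"
# Prints "yes" if the image's content at the mount path ends up in the mount.
image_content_copied() {
    kind=$1
    state=$2
    if [ "$kind" = "volume" ] && [ "$state" = "empty" ]; then
        echo yes   # the empty volume is populated from the image
    else
        echo no    # the mount hides whatever the image had at that path
    fi
}

image_content_copied volume empty      # new or freshly created volume
image_content_copied volume nonempty   # volume reused from an earlier run
image_content_copied bind empty        # empty host directory
image_content_copied bind nonempty     # host directory with data
```

Only the first case prints "yes"; in the other three, the container sees exactly what the volume or host directory already contained.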

Thank you for the detailed answer, it helped a lot.
In my scenario I will probably mount a directory from the server's file system into the container instead of using volumes.