Best practice about named volume in Swarm

antoinetran · November 14, 2016, 11:09am

Hi,

I do not have a clear picture of how to populate a named volume in Swarm. I am wondering what the best practices for named volumes are, since they now replace the deprecated data-only containers (DOC) (according to a Docker member, since Feb 2016, source here). The official doc https://docs.docker.com/engine/tutorials/dockervolumes/ still talks about data-only containers, but shouldn’t (or at least it should state that DOCs are deprecated).

My use case is that I will have a Swarm cluster (old standalone Swarm for now, Swarm mode later), I will run containers there, and I will attach configuration files to them through either a DOC or a named volume.

With DOC:

For this, I planned to build these files from our SCM (git) into a data-only image, push it to a registry, then run this DOC from the registry in the cluster, next to the main container, using “--volumes-from”.
For example, the docker run commands would be:
docker run -d --name Config configfileforhttpd
docker run --volumes-from Config httpd

Without DOC:

I would have to declare an external volume in docker-compose.yml, so I suppose I have to create a compose file with a named volume first, and then somehow put the configuration files inside that volume. The files would come from another container (not technically a DOC), and I would run:
docker run -v [named_volume]:/path/to/volume/internal myimage cp -rf /files /path/to/volume/internal
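
The copy step above can be sketched as a small shell helper. This is only an illustration under assumptions: the volume name, image name and target path passed to it are hypothetical, the image is assumed to ship its files under /files and to contain cp, and the positional form of docker volume create assumes Docker 1.13 or later (older releases used --name). Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: seed a named volume from an image that carries the config files.
# All names passed to seed_volume are hypothetical examples.

# Print the command instead of executing it when DRY_RUN=1, so the
# sequence can be reviewed on a machine without a Docker daemon.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

seed_volume() {
    volume=$1   # named volume to create and fill
    image=$2    # image that contains the files under /files (assumption)
    target=$3   # mount point of the volume inside the seeding container

    # Create the named volume (positional name: Docker 1.13+).
    run docker volume create "$volume"

    # One-shot container: copy the content of /files into the mounted
    # volume, then exit and remove the container.
    run docker run --rm -v "$volume:$target" "$image" cp -rf /files/. "$target"
}
```

For example, seed_volume httpd_config configfileforhttpd /conf would fill a volume named httpd_config. Note that /files/. copies the content of /files, whereas cp -rf /files [target] would create a files subdirectory inside the volume. With the default local driver the volume only exists on the node where it was created, so the seeding container and the main container have to land on the same node.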

antoinetran · November 14, 2016, 11:36am

I found an example where someone did the same thing: “Need design help on DB approach: named volume vs data containers”. So I guess this is a good practice?

aleveille · November 16, 2016, 2:08am

Hi there!

It’s been a while since I posted that thread. I can however say that we are still using what I described in that thread.

I don’t know if it qualifies for a “Best practice” title, but it sure is a very easy and straightforward way to make sure that at least a default set of files is present, without cluttering the application image with that data or with logic to handle the bootstrapping.

What I like about the data-only image I described in the other thread:

Has a separate life-cycle from the application image: an update to the bootstrap data is different from an update to the app, both live independently.
Easy to force a new set of data. In the bootstrap data, I have a version flag. If I update the image with a mandatory update to the data, I increment the version flag in the image. On the next (pull +) run, the container will overwrite the data.
Easy to manage different bootstrap data sets with Docker tags.
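
The version-flag mechanism could look roughly like the script below, run as the command of the data-only image with the named volume mounted. This is a minimal sketch, not the script from the other thread: the function name, the VERSION file and the directory layout are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a version-flag bootstrap (hypothetical layout): the image ships
# its data in one directory together with a VERSION file, and the volume
# keeps a copy of VERSION after each sync.

sync_bootstrap_data() {
    src=$1   # data baked into the image, e.g. /bootstrap
    dst=$2   # mount point of the named volume, e.g. /data

    image_version=$(cat "$src/VERSION")
    # A volume that was never seeded has no VERSION file yet.
    volume_version=$(cat "$dst/VERSION" 2>/dev/null || echo "none")

    if [ "$image_version" = "$volume_version" ]; then
        echo "data already at version $image_version, nothing to do"
        return 0
    fi

    # Overwrite the volume content. VERSION is copied along with the data,
    # so the next run with the same image is a no-op.
    cp -rf "$src/." "$dst/"
    echo "data updated from $volume_version to $image_version"
}
```

In the image, the command would then be something like sync_bootstrap_data /bootstrap /data. Files that exist only in the volume are left alone by this copy, so a data update that needs to delete files would require an extra cleanup step.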

As an alternative to the above model, we also thought of storing the data in a repository in the cloud and downloading it only when required. That would be a little more complex to manage, however, and I’m not fond of the external/Internet dependency. On the other hand, it could allow replacing the data on every container run (not only when pulling the image), or even hot-swapping it if your data container periodically checks the repo for a new version of the data.

Hope this helps,
Alexandre

antoinetran · November 22, 2016, 9:51am

Thank you Alexandre, your answer is very useful. I also like the idea of bootstrapping and forcing an update whenever the version has changed.

However, we will revisit this practice (of named volumes) once we have big data (in terms of gigabytes) inside data-only images. Copying the data into a named volume, to avoid an anonymous volume, might be too costly compared to just using “--volumes-from”.