We can't push a docker volume so now what?

Hi,

I’m trying to find a holistic solution for getting our compilers and toolchains into docker containers efficiently.

Problem:

  • We have several target platforms for software and firmware. They range from a few hundred MB to 15 GB for the firmware ones. We have one docker-compose.yml per product, which controls the differences between dev and prod environments; different services in it build different parts of the product.

Ideal solution:

  • Being able to push docker volumes to a registry (or something similar) that docker/docker-compose can automatically pull from if the volume is not on the local host.
    Of course, we can’t push volumes.

What won’t work:

  • NFS mounts. They are slow; our builds are 50%+ faster when the compilers/toolchains are on the local system’s storage.
    Pushing volumes to a registry.

Solutions I am contemplating:

  • 1.) Docker volume driver plugin: I have looked for one that could help but haven’t found any yet. If one exists, please tell me. S3 won’t work.

  • 2.) Create a volume and bake it into our production build nodes (AMIs in our case). However, this does not let devs build locally with the volume, the pipeline to do it gets very cumbersome, and it doesn’t allow for easy debugging or upkeep.

  • 3.) Follow Eran Avidan’s idea and use an image instead of a volume. This still seems the most likely, but I don’t like it. I have structured my docker-compose.yml files so devs can either use their own compilers or supply a variable that makes compose use the image with the compilers instead (a sketch of that switch is below this list). docker-compose cannot make this switch without a lot of editing, and these images take up extra resources.

  • 4.) Baking them into the product’s main image that contains the build logic. This would create huge images, make the main image cumbersome to update, and restrict devs from using their own compilers if they want to.
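
For reference, here is roughly the kind of switch option 3 refers to. This is only a sketch, not our actual file; TOOLCHAIN_IMAGE, LOCAL_TOOLCHAIN_DIR, and the paths are made-up names:

version: "3.5"
services:
  builder:
    # Devs get a plain base image by default; setting TOOLCHAIN_IMAGE makes
    # compose use the prebuilt image with the compilers instead.
    image: "${TOOLCHAIN_IMAGE:-alpine:latest}"
    volumes:
      # Alternatively, devs can point this at their locally installed compilers.
      - "${LOCAL_TOOLCHAIN_DIR:-./toolchains}:/opt/toolchains"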

Does anyone have any ideas or suggestions I could investigate?

thanks,
Scott

One option I’ve thought of (but haven’t tested) for sharing volume data across hosts, if the data doesn’t change super-frequently, is to use some sort of file sync utility (rsync, bit torrent sync, etc.) to sync changes. This could be done on a timed basis (e.g. cron-like), at the end of a build cycle, or potentially when objects change state. A lot depends on the size of the volumes (you mention some might be many gigabytes), the network speeds, and any latency between pushing images & volumes, and how quickly you need to run those images. I don’t know if this will work, but as a primitive form of remote file sharing, it might be an option.
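
Untested, but for the timed variant I’m imagining something along these lines; the host name and paths are made up:

# /etc/cron.d/sync-toolchains (on each build host)
# Pull toolchain changes from a central host every hour; rsync only transfers deltas.
0 * * * *  root  rsync -a --delete build-host:/opt/toolchains/ /opt/toolchains/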

After much thought, we have found a solution. Please let me know your thoughts.

TL;DR:
We use an image that creates a volume if needed and then exits, leaving behind the volume and nothing else.

EXPLANATION
We achieved the result we wanted by combining docker-compose volumes, Dockerfile VOLUME declarations, single-layer images, and docker-compose depends_on.

Docker images are not supposed to be large, but if you keep them to a single layer (or at least keep the large content in one final layer), the size becomes much less of an issue.

In my Dockerfile I have:

FROM alpine:latest
# Single RUN so the toolchain ends up in one layer
RUN mkdir -p /some_dir && cd /some_dir && \
    wget some_files && \
    tar xf some_files && \
    rm -f some_files
# Declare the toolchain directory as a volume
VOLUME ["/some_dir"]

Then in my docker-compose.yml, multiple services are declared, all of which depend on a service whose sole purpose is to spin up a container from the image the above Dockerfile creates and attach its volume to a named volume that all the other containers mount.

In my docker-compose.yml I have:

version: "3.5"  
services:  
  myVolume-service:
    image: volume_image
    command: >
         /bin/sh -c "./do_nothing_and_exit_imediately"
    volumes:
    - compilers:some_dir #same dir as specified in above Dockerfile
  service-a:
    container_name: shell
    image: come_custom_aplication_image
    depends_on:
     - myVolume-service
    volumes:
    - myVolume:/dir/to/mount/myVolume:ro    
    command: >
         /bin/bash -c "./do_service-a_stuff"`
volumes:
  myVolume:

This opens up possibilities for other things, such as syncing: combined with healthchecks, it lets updates (and anything else that benefits from being in the volume) get into the volume at runtime when rebuilding the volume image isn’t an option.
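
As a rough, untested sketch of what that could look like, the volume service could stay running and sync periodically, with a healthcheck gating the other services; the sync script, marker file, and interval here are all placeholders:

  myVolume-service:
    image: volume_image
    # Hypothetical: instead of exiting immediately, keep running and pull updates into the volume.
    command: >
      /bin/sh -c "while true; do ./sync_toolchains; sleep 300; done"
    volumes:
      - myVolume:/some_dir
    healthcheck:
      # Hypothetical marker file written by the sync script once the volume is up to date.
      test: ["CMD", "test", "-f", "/some_dir/.synced"]
      interval: 1m
      timeout: 10s
      retries: 3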

I will update this if we determine that syncing/healthchecks are useful.


Hey, what you can also do is use a Docker image as a distribution mechanism for data and create a volume out of it.
E.g.: docker run -ti --rm --mount source=wps,destination=/WPS_GEOG/ bigwxwrf/ncar-wpsgeog
This spawns a container from the image and copies the content of the destination path into the named volume.
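
Once that volume exists, other containers can mount it (read-only if you like); roughly, with a placeholder image name:

# Reuse the "wps" volume created above; "my-build-image" is just an example.
docker run --rm --mount source=wps,destination=/WPS_GEOG/,readonly my-build-image ls /WPS_GEOG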

Having docker images directly mountable would be awesome though. AFAIK that is the closest you can get today.