Docker Volume Syncing with Pre-Existing Container Data

I’m facing a problem with Docker where syncing a volume between my host and a Docker container results in the loss of pre-existing data in the container. Specifically, when I mount an initially empty volume from my host into the container, it doesn’t sync any of the data previously set up in a specific directory inside the container.

Here’s a brief overview of my Docker setup:

Dockerfile:

# Using Python 3.9 as base image
FROM python:3.9

# Set the working directory in the container
WORKDIR /app

# Install necessary packages and setup environment
RUN apt-get update && apt-get install -y git rsync
RUN pip install --upgrade pip virtualenv
# [Other setup commands...]

# Creating directories for various components
RUN mkdir component1_dir component2_dir component3_dir

# [Additional configuration and environment setup...]

# Add and set up the startup script
COPY startup.sh ./
RUN chmod +x startup.sh

# Define the command to run the startup script
CMD ["./startup.sh"]

Startup Script (startup.sh):

#!/bin/bash

# Syncing files from the mounted host volume to the /app directory in the container
rsync -a /host_volume/ /app/

# Command to keep the container running
tail -f /dev/null

Docker Run Command:

docker run -it --name my_container -e SOME_ENV_VARIABLE=%VALUE% -v my_host_volume:/host_volume my_docker_image

The problem occurs when I mount an empty volume (my_host_volume) from my host to the container. The startup script is supposed to sync /host_volume into /app, but it does not: the volume does not seem to be in sync and remains empty. I need to ensure that the existing contents of /app are preserved while still being able to sync new or changed files from my_host_volume. How can I achieve this without the volume mount deleting the pre-set-up content in /app?

Normally you wouldn’t even need rsync in a container. You would mount a volume and the application would use it. If it is some kind of base data, then you should create a volume, populate it, and create the container mounting the volume to the final destination.
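
For example (a rough sketch using the names from your post; I haven’t tested it against your image): an empty named volume mounted over a non-empty directory is populated from the image on first use, so you can pre-populate it with a throwaway container:

# Create the volume; mounting it (while empty) at /app copies the
# image's existing /app content into the volume on first use
docker volume create my_host_volume
docker run --rm -v my_host_volume:/app my_docker_image true

# Later containers mounting the same volume see the populated data
docker run -it --name my_container -v my_host_volume:/app my_docker_image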

If you mounted the data somewhere else and you still want to use rsync, that is not a Docker problem. If you think so, please, share how it is related to Docker. It is possible I misunderstood the whole problem.

I tried to mount a volume with the intention of setting up data within the container and then retrieving the newly created data on the host machine through the volume. So far, my attempts to accomplish this have been unsuccessful. I believe this is a Docker-related issue, as I’ve been able to easily mount local files into the container, but I’m facing challenges when trying to perform the reverse operation.

I still don’t understand that “reverse operation” part. Maybe you need a volume with a custom path?

I wrote about that here: https://dev.to/rimelek/everything-about-docker-volumes-1ib0#custom-volume-path

That would work as a volume, so data in the container would be written to the volume automatically when needed, but you could also get at the files in the custom folder instead of digging into the default named volume location.
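
A minimal sketch of that idea (the directory and volume names are just examples): a named volume backed by a host folder you choose:

# Create a named volume whose data lives in a directory you pick
mkdir -p "$PWD/volume-data"
docker volume create my_custom_volume \
  --driver local \
  --opt type=none \
  --opt o=bind \
  --opt device="$PWD/volume-data"

# It still behaves like a normal named volume (including being
# populated from existing container content when empty), but the
# files end up in ./volume-data on the host
docker run --rm -v my_custom_volume:/app my_docker_image ls /app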

What I meant to say was: when you build files into a Docker image (files that are not present on your host machine) and later wish to access them while the container is running, mounting a host directory onto the directory where those files were created can lead to a situation where your local machine’s files (if any) overwrite, replace, or hide the container’s files. I was exploring a way to use volumes while retaining access to files generated during the image build process. I am not sure how to go about it. I don’t believe the article you wrote covers my needs, although thanks for the share.

Now I do think that the article is for you, even more than I thought before your last post. You have to understand what a volume is. Mounting a volume will never override files in the container unless the volume is not empty. Bind mounts will always override the content in the container, but you wrote about volumes, not bind mounts. Even with volumes with a custom path or with bind mounts, you can’t access data both in the container and on the host unless you understand how to set proper file permissions. You also seem to mount the volume to /app, which indicates that you put your whole app in a volume. Use volumes for data used in the container, and use bind mounts for sharing your source code with the container if you want to continue developing it.
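
A quick way to see the difference (reusing your image name; untested sketch):

# Named volume: the first mount of an EMPTY volume copies the image's
# /app content into it, so nothing in the container is hidden
docker run --rm -v app_data:/app my_docker_image ls /app

# Bind mount: the host directory always shadows /app, so an empty
# host folder makes /app appear empty inside the container
mkdir -p "$PWD/empty-dir"
docker run --rm -v "$PWD/empty-dir:/app" my_docker_image ls /app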

You can also use the watch feature in Docker Compose so your source code can be synchronized if you don’t want to bind mount it.
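
Roughly like this in a compose file (the service name and paths are made up; adjust them to your project):

services:
  app:
    build: .
    develop:
      watch:
        # Copy changed files from ./src on the host into the container
        - action: sync
          path: ./src
          target: /app/src
        # Rebuild the image when dependencies change
        - action: rebuild
          path: requirements.txt

Then run it with docker compose watch instead of docker compose up.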

I still recommend reading the whole blog post I shared in my previous answer, not just the section I linked to directly.

I’ve read it, and I meant to refer to bind mounts instead. Thank you for providing the sources. However, based on your knowledge, do you know a method to preserve access to files generated during the image build process and make them accessible on the host machine simultaneously?

You mean setting the right permissions to be able to edit the files on the host?

Based on the Dockerfile, it seems the owner of the files should be root, which means you could run rootless Docker, assuming you are using Linux and not Docker Desktop. Then your user becomes root in the container, so anything generated in the container as root can be edited by you on the host.
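
For example, you can check where a volume’s data lives and who owns it from the host side (volume name taken from your run command; untested sketch):

# Show where Docker stores the volume's data on the host
docker volume inspect my_host_volume --format '{{ .Mountpoint }}'

# With rootless Docker the mountpoint is under your home directory
# (~/.local/share/docker/volumes/...) and files created as root in
# the container are owned by your own user on the host
ls -l "$(docker volume inspect my_host_volume --format '{{ .Mountpoint }}')"

Note that with regular (rootful) Docker the mountpoint is under /var/lib/docker and usually requires root to access.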

If you meant copying the data from the container to the host, that is why I suggested the volume with the custom source path.
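
Continuing the custom-path sketch from earlier (names are still placeholders): files the container writes show up directly in the chosen host folder, no rsync needed:

# Generate a file inside the container...
docker run --rm -v my_custom_volume:/app my_docker_image \
  sh -c 'echo hello > /app/generated.txt'

# ...and read it on the host
cat ./volume-data/generated.txt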