Novice: cannot locate my files using volume

I am struggling with finding the .txt file which should be created by my Python program. I use the following commands:

sudo docker build -t newdocker1 .
sudo docker volume create volume1
sudo docker run -v volume1:/home/ilze/PycharmProjects/pythonProject newdocker1

The volume is created, and the path “/home…/pythonProject” leads to the actual directory where my Dockerfile and .py file live, along with the venv. Everything executes without throwing an error and the on-screen output is fine, but I cannot locate the created file. What am I doing wrong? I am new to Docker and to Linux as well.
When I run main.py outside of Docker, the .txt file is created next to main.py.

My dockerfile is simple:

FROM ubuntu
RUN apt-get update
RUN apt-get -y install python
COPY . .
ENTRYPOINT ["python", "main.py"]

What do you need the volume for?

With COPY . . you’re copying everything from the current folder (which I assume is /home/ilze/PycharmProjects/pythonProject) into the image, and all of that will be available in the root of your container’s file system.

You could copy it into a specific folder using, e.g.,

WORKDIR /app
COPY . .

Regardless, with the above, the files from your current folder are already copied when creating the image. You don’t need any volume to access those files when running the image in a container.

Next, whatever file is created or changed in the container by your app is available to your app, but is ephemeral and will be gone when you delete the container. So, maybe you want a volume to store/persist data as created by your app? This data (the .txt file the app created) will then be available until you remove the volume. But then the name/location within the container should probably be unrelated to /home/ilze/PycharmProjects/pythonProject. It would make more sense to use, e.g., -v volume1:/data to make the volume available as /data to the app in your container.

If, when running the container, you also want to access some files on the host directly (or to easily access the created .txt file from your host), then you’d want a bind mount (to mount an existing file or directory on your host machine to the container), not a Docker volume (managed and stored within Docker itself). For that, there is no need to first run docker volume create volume1. Just like with a volume, the bind mount syntax expects the source (for a bind mount: the location on the host) first, and the location in the container second. Like so:

docker run -v /home/ilze/PycharmProjects/pythonProject:/data newdocker1

This would make the host folder available as /data to the app running in the container, and whatever the app writes there would still exist on the host after deleting the container.

As the above examples use /home/ilze/PycharmProjects/pythonProject both as the source for COPY . . into WORKDIR /app and in -v /home/ilze/PycharmProjects/pythonProject:/data, the files in /app would show the situation as it was when you built the image, and the files in /data would also show any changes made afterwards. (So, it makes more sense to use a different source folder for /data, like /home/ilze/data/pythonProject.)

See also Manage data in Docker | Docker Docs.

First of all, thank you for your extensive reply!
I used a volume because one of the apps produces files which are needed both by another app (a container from another image) and by users. Therefore I assumed that a volume was the right option.
I included WORKDIR /app in my dockerfile and ran

docker run -v /home/ilze/PycharmProjects/pythonProject:/data newdocker1

However, I still cannot find the resulting file on the host’s file system. When I used the named volume I looked in /var/lib/docker/volumes/volume1 (where I did not find any file). Where is a file that is created at runtime located in the bind mount case?

I got it working by including WORKDIR /app in my dockerfile and running

sudo docker build -t newdocker1 .
sudo docker volume create volume1
sudo docker run -v volume1:/app newdocker1

My files are now in /var/lib/docker/volumes/volume1/_data
Thank you for guidance!

Nice that it works, but I wonder if it works as expected.

I assume you’re seeing all your files there: the ones you copied into WORKDIR /app in the Dockerfile, and anything the app creates. Right?

Note that the documentation explains:

If you mount an empty volume into a directory in the container in which files or directories exist, these files or directories are propagated (copied) into the volume.

So, when the container is started for the first time, the files you copied into WORKDIR /app using COPY will also be copied to the volume you mount at the very same /app. However, as the volume is not deleted when you remove or restart the container, it will no longer be empty after rebuilding the image and starting a new container. In other words: in /var/lib/docker/volumes/volume1/_data you’ll keep seeing the first version of whatever you copied into /app during the first build.

Also, I’d very much recommend separating data and code. Put your application code in, e.g., /app and make the application use another folder for its data, say /data.

And beware of the “Non-Docker processes” part in the following:

Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.

Without knowing anything about your use case, bind mounts may really be the better choice to share data between the host (users) and the container(s).

(Aside, Mac users won’t even see the Docker volumes on the file system directly, as Docker uses a VM.)

Even more so: this implies that Python, which is looking for main.py in WORKDIR /app, will always be using the first version of your application code once that was copied into the volume, even if you build a new image afterwards. For example, when using the official Python base image, with a Dockerfile like:

FROM python:3
WORKDIR /app
COPY . .
CMD ["python", "./main.py"]

With the above Dockerfile, the following creates a main.py with only print('Hello world'), executes the first Docker build, and runs the container for the first time using a new, empty, volume:

echo "print('Hello world')" > main.py
docker build -t mytest .
# This will create "myvolume" if it does not exist yet
docker run --rm -v myvolume:/app mytest
Hello world

So far, so good. Or, so it seems! See what happens when changing main.py to print something different, but while mounting the same existing volume to /app:

echo "print('Goodbye world')" > main.py
docker build -t mytest .
# This will mount the existing "myvolume" to /app, holding the old main.py
docker run --rm -v myvolume:/app mytest
Hello world

Next, removing the volume and running the very same image, which now will create a new empty volume again and copy the updated contents of /app from the last image into that:

docker volume rm myvolume
docker run --rm -v myvolume:/app mytest
Goodbye world

(Clean up after the above: docker volume rm myvolume.)

I doubt your users need access to the application code in the container? So make your application write its output to something like /data, not to /app, and only share that /data. In fact, make that a configuration which you can access using os.environ or os.getenv, like os.getenv("MYNAME", "world"):

# printf (unlike echo in bash) reliably turns \n into a newline
printf "import os\nprint(f'Hello {os.getenv(\"MYNAME\", \"world\")}')\n" > main.py
docker build -t mytest .

docker run --rm mytest
Hello world

docker run --rm -e MYNAME=ilzhuu mytest
Hello ilzhuu
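
The same idea could be applied to the output folder. Below is a minimal sketch, not your actual code: the variable name DATA_DIR, the file name result.txt and both function names are assumptions made up for illustration. It writes to the folder named in DATA_DIR, and falls back to the current folder when that variable is not set (so it behaves as before when run without Docker).

```python
import os
from pathlib import Path


def resolve_data_dir(default: str = ".") -> Path:
    """Return the output folder: DATA_DIR if set, else the given default."""
    # DATA_DIR is a hypothetical name chosen for this sketch;
    # Docker itself does not define or know about it.
    return Path(os.getenv("DATA_DIR", default))


def write_result(text: str, filename: str = "result.txt") -> Path:
    """Write text into the configured data folder and return the file path."""
    data_dir = resolve_data_dir()
    # Create the folder if needed, e.g. on the first run outside Docker.
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / filename
    target.write_text(text)
    return target


if __name__ == "__main__":
    print(f"Wrote {write_result('Hello world')}")
```

Assuming main.py looks like the above, something like docker run --rm -e DATA_DIR=/data -v /home/ilze/data/pythonProject:/data newdocker1 should leave result.txt in /home/ilze/data/pythonProject on the host, while the code itself stays in /app.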

I missed this question from an earlier post, about where a file created at runtime ends up when using a bind mount:

For bind mounts, you are sharing an existing folder (or file) on your host machine with Docker. Anything the application in the Docker container does with that shared folder will simply be visible in that host folder. And the other way around: if you change something on the host machine then the application in the container can see that. So, if you share your $HOME folder using a bind mount, then the application has direct access to everything in the user’s home folder. See Bind mounts | Docker Docs.

Just to be sure: note that the name in the container does not need to be the same as the original folder: using, say, -v $HOME/my/folder:/data shares the host directory /home/ilze/my/folder and makes it available as (“mounts it as”) /data in the container. You can even share the very same host folder (or volume) to multiple locations in the container, like -v $HOME/my/folder:/input -v $HOME/my/folder:/output would show two folders /input and /output in the container, which would both map to the same folder on the host. In the container, the two folders would also show the same files; if the application creates a file in /output it would also see it in /input.

(Same goes for volumes: in your very first post, using -v volume1:/home/ilze/PycharmProjects/pythonProject made the volume available in the container as /home/ilze/PycharmProjects/pythonProject, which matched the name on your host machine, but really had nothing to do with that folder on the host. If you had used -v volume1:/some/other/folder then the same volume would have been available in the container as /some/other/folder.)

Thank you for pointing out possible drawbacks. I am starting to realize that perhaps I need a bind mount, not a volume. I have one piece of code (which I expect to put in one container) that creates a machine learning classifier from the provided input file and produces a classification model and some other files as output. Another program (second container) uses this classification model along with a human-provided data file to produce classification output (another file). So the files are used/provided both by programs and users.

How do I do that?
I did put the application code in /app, but how do I make the app read and write files somewhere else, e.g. /data?

That’s really up to your application. I assume your Python code is now using “the current” folder, which is the same folder where the Python script itself lives? You’d somehow need to make that code use a different folder (some full path, preferably a configurable full path, for example using environment variables and os.getenv like I explained 4 days ago, which you then set to /data when using Docker, or set to some /home/ilze/my/folder when not using Docker).

One would need to see your Python code to help, but that’s also out of scope for this forum as it’s really not related to using Docker; you should be able to make your code use a configurable folder regardless of whether you’re using Docker or not.

(Also, the WORKDIR /app is really only an instruction for Docker to set the current folder when using relative paths in COPY and when running your CMD. It won’t affect how your Python code does things, as your Python code does not know it’s running in Docker.)
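
To see which folder your code is actually using, a small sketch like the following might help. It only relies on Python's standard pathlib; the function name describe_paths and the file name output.txt are made up for illustration. It compares the current folder (which WORKDIR sets for your CMD or ENTRYPOINT) with the folder the script lives in, and shows where a relative file name would end up.

```python
from pathlib import Path


def describe_paths(script_file: str) -> dict:
    """Show where a relative filename would end up versus where the script lives."""
    return {
        # The folder the process was started in (WORKDIR when run by Docker).
        "current_dir": Path.cwd(),
        # The folder that contains the script itself.
        "script_dir": Path(script_file).resolve().parent,
        # A relative name such as "output.txt" is resolved against current_dir.
        "relative_target": Path("output.txt").resolve(),
    }


if __name__ == "__main__":
    for name, value in describe_paths(__file__).items():
        print(f"{name}: {value}")
```

If current_dir and script_dir are the same (as they would be with WORKDIR /app and COPY . .), files written with a relative name end up next to the code, which is why a configurable full path is the safer choice for output.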