Container initialisation, the correct way

Hello,

I would like to create a container with a service. When the container is brought up for the first time, it should initialise itself (e.g. create some files, generate certificates, etc.). Later on, when the container is brought up a second time (even with the --build flag), the previously generated files must not be overwritten. I came up with something like this:

#myservice.Dockerfile
FROM ubuntu:24.04
COPY ./entrypoint.sh /scripts/entrypoint.sh
RUN chmod +x /scripts/entrypoint.sh
VOLUME /data
ENTRYPOINT ["/scripts/entrypoint.sh"]
CMD ["echo","Start"]

#!/bin/bash
#entrypoint.sh
if [ ! -f /data/initialised ]; then
    echo "$(date)" | tee /data/initialised
fi

#docker-compose.yml
services:
    myservice:
        build:
            context: .
            dockerfile: myservice.Dockerfile
        volumes:
         - ./data:/data

It seems to work, but is it the correct way to do it? I could not find a straight answer to this problem.

Thanks for your time!

Hi

I would say yes: the correct way is to use a volume, as you did.

This is how images like MySQL or Postgres work: if a volume exists, they check whether a database is already present. If so, the database creation steps are skipped.

So yes, for me, it’s the correct way.
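
The same "check first, then initialise" pattern can be sketched as a small shell function. This is only a minimal sketch, not what the MySQL or Postgres images actually run: the function name init_once and the marker file name are assumptions for illustration, and the one-time work itself is left as a placeholder.

```shell
#!/bin/sh
# init_once DIR: run first-time setup in DIR unless the marker file exists.
# (hypothetical helper; name and marker file are illustrative assumptions)
init_once() {
    dir="$1"
    marker="$dir/initialised"
    if [ -f "$marker" ]; then
        # A previous run already finished: leave existing files untouched.
        echo "already initialised, skipping"
        return 0
    fi
    mkdir -p "$dir" || return 1
    # ... one-time work goes here (create files, generate certificates) ...
    # Write the marker last, so a failed setup is retried on the next start.
    date > "$marker" || return 1
    echo "first run, initialised"
}

# In an entrypoint script this could be followed by the container command:
#   init_once /data && exec "$@"
```

Ending the entrypoint with exec "$@" hands control to whatever is given as CMD, so the CMD from the Dockerfile still runs after the initialisation step.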

Thanks. I’m not sure I correctly understand the relation between the volume created via the VOLUME instruction in the Dockerfile and the volumes section in docker-compose.yml. The VOLUME instruction in the Dockerfile creates an anonymous volume in the /var/lib/docker/volumes directory, and the volumes section in docker-compose.yml “binds” the host folder to the container’s folder.

So let’s say I want to move my project to another host along with the data folder to keep it already “initialised”. Do I also need to move the anonymous volume (the one created with the VOLUME instruction in the Dockerfile)?

I’ve never used the volume CLI command; I only use volumes through the YAML files. And I’ve never moved volumes from one location to another, so I can’t say.

As long as the volume is still there (that is, you have not removed it), you can kill the container and start it again: the volume will still be there and the result of your initialization is kept.

Do not use the VOLUME instruction. It cannot be undone (an image built on top of yours cannot remove it), and you don’t need it, because you can easily create anonymous volumes without a Dockerfile if you want to. You just use the volumes section without specifying the source, so instead of

- ./src:/dest/

you would just do

- /dest/

The bind mount is not actually a volume. The name of the volumes section is misleading and I believe it should be deprecated, but as far as I know it is the only name we can use. You can read about volumes and bind mounts (the kind you defined in the compose file) in the documentation:

https://docs.docker.com/engine/storage/volumes/#named-and-anonymous-volumes

You can actually lose data when using anonymous volumes, so it is best to use at least named volumes, or a bind mount as you did. But when you use bind mounts, you sometimes need to set the file permissions in advance if the process in the container is not running as root.

I wrote about named volumes and also about named volumes with a custom source path. The latter is a good mix of using volumes (permissions are set automatically and files are copied from the container) and having a custom source path outside the Docker data root under /var/lib/docker/volumes. Then you can even delete the volume and your data will not be deleted.

But even plain named volumes are good for making sure you can always run docker compose down and docker compose up and still have the original data. With anonymous volumes, you would get a newly generated anonymous volume without your old data.

This could work, but when you run the process in the container as a non-root user, it wouldn’t always work, as the user might not have write permission. So you can at least create a subfolder under /data in the image, with the right permissions, into which your user is able to write. For example: /data/db/ or /data/yoursoftwarename. Then you can do the initialization in that folder.
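
To make a permission problem visible early, the entrypoint could check that the target folder is writable before it starts initialising. The following is a minimal sketch under that assumption; the function name require_writable and the path in the usage comment are hypothetical.

```shell
#!/bin/sh
# require_writable DIR: succeed only if DIR exists and the current user
# can write to it. (hypothetical helper for illustration)
require_writable() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "error: $dir does not exist" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "error: $dir is not writable by uid $(id -u)" >&2
        return 1
    fi
    return 0
}

# In an entrypoint this could run before the initialisation step, e.g.:
#   require_writable /data/yoursoftwarename || exit 1
```

Failing with a clear message is usually easier to debug than a half-finished initialisation caused by a "permission denied" error somewhere in the middle of the script.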

You would need to, but I don’t think anyone uses anonymous volumes for persistent data. Anonymous volumes can be good sometimes, but mainly when you just want to share sockets or quickly try something out and don’t care about persistence, so you don’t need a human-readable name for the volume.

And even if you moved an anonymous volume to another machine, you couldn’t mount it back into the container unless you specifically referred to its ID, so why not use a named volume in the first place? Or a bind mount, which is what you already did.
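
For the bind mount case, moving the data to another host comes down to copying the host folder. A minimal sketch using tar follows; the function names pack_data and unpack_data are assumptions for illustration, and file ownership is only restored when the archive is extracted as root.

```shell
#!/bin/sh
# pack_data SRC_DIR ARCHIVE: archive the contents of SRC_DIR.
# (hypothetical helper; keep ARCHIVE outside of SRC_DIR)
pack_data() {
    tar -C "$1" -czf "$2" .
}

# unpack_data ARCHIVE DEST_DIR: restore the archive into DEST_DIR,
# keeping the recorded file permissions (-p).
unpack_data() {
    mkdir -p "$2" && tar -C "$2" -xpzf "$1"
}

# Example, assuming the compose project uses ./data as the bind mount source:
#   old host:  pack_data ./data /tmp/data.tar.gz
#   new host:  unpack_data /tmp/data.tar.gz ./data
```

Stopping the container before packing avoids archiving files while the service is still writing to them.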

If you need to move named volumes, you can try my scripts too