I have a general question about data volumes and database services.
I noticed in a docker-compose file that none of the databases had volumes for their data files.
We are using mysql, kafka, aerospike, redis, and rabbitmq.
I saw a performance win after I added the volumes. I am running QE automation tests that bring up dockerized mysql, kafka, aerospike, redis, and rabbitmq.
The tests finished faster with the data volumes.
Question: Is it more performant to use volumes for database services? If you don't use volumes, is the data basically kept in memory?
If no volume is mapped against a container path, the data is written into the ephemeral container filesystem. Depending on the storage driver, this can be a full copy of the previous layer plus the new data, or just the new data in the write layer of the container filesystem. If the container is deleted, its filesystem is deleted as well, so it is not suitable for persistent data that should survive re-creation of the container (e.g. when you deploy a new container based on a new image from the same repo).
A volume without any further declaration, on the other hand, is nothing more than a Docker-managed path in the host filesystem that gets bind mounted into the container filesystem. Writes to it bypass the storage driver, so of course this is going to be faster.
You could also declare a volume with the tmpfs type, which then would indeed be in memory.
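To illustrate the tmpfs option, here is a minimal compose sketch. The service name, the image tag `redis:7` and the target path `/data` are assumptions for illustration and are not taken from your compose file; adjust them to whatever your services actually use.

```yaml
# Hypothetical example: keep a service's data directory in memory.
services:
  redis:
    image: redis:7          # assumed image tag
    volumes:
      - type: tmpfs         # in-memory mount, nothing is written to disk
        target: /data       # assumed data directory of the image
```

Anything written to a tmpfs mount is lost when the container stops, so this only fits throwaway data such as short-lived test runs.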
I forgot to mention that if a path inside a container is declared as VOLUME and no volume is mapped against that folder, an anonymous volume will be used. It behaves exactly like a default named volume, with the only difference that its name consists of random alphanumeric characters.
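As a rough sketch of the difference between a named and an anonymous volume in a compose file (the service names, image tags, paths and the placeholder password are assumptions, not your actual setup):

```yaml
# Hypothetical example: one named volume and one anonymous volume.
services:
  mysql:
    image: mysql:8.0                  # assumed image tag
    environment:
      MYSQL_ROOT_PASSWORD: example    # placeholder value
    volumes:
      - mysql-data:/var/lib/mysql     # named volume, declared below
  redis:
    image: redis:7                    # assumed image tag
    volumes:
      - /data                         # anonymous volume: container path only

volumes:
  mysql-data:                         # default, Docker-managed named volume
```

The named volume can be found again by its name after the container is re-created, while the anonymous one gets a random name each time a new one is created.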
A volume (=storage from outside the container filesystem) is mounted into a path of the container filesystem. So whatever is written into that path is written outside the container.
I don't understand what you are trying to tell me with faster and slower: both use whatever filesystem is mounted at /var/lib/docker, but the container filesystem has a storage driver on top of it. It doesn't make sense that a volume that writes directly to the host filesystem should be slower than a storage driver that writes to the host filesystem.
Before you ask further questions, please share the output of docker info, so we can see what filesystem and storage driver you use, the exact image (or its Dockerfile), and your compose file. I want to see and understand what you are doing.
I hadn't noticed this was in the wrong category: "Tips & HowTos" is meant to be used when you publish them, not to ask for them. I moved it to Docker Desktop for Windows and also wrapped your output and compose file in preformatted text blocks, so they become easier to read.
Since you use Docker Desktop, try to avoid mapping Windows host folders into container folders, as this is always going to be slow.
Depending on whether you run docker compose up in a Windows terminal or in a terminal of a WSL distribution for which you enabled Docker Desktop integration (where the compose file and everything else is also on the filesystem of the WSL distribution and not on the Windows host, i.e. not under paths such as /mnt/c or /mnt/d), you might see a huge difference in performance for the following services:
With Docker Desktop for Windows, anonymous volumes and named volumes store their data inside the WSL distribution docker-desktop-data, which uses an ext4 filesystem, and is way faster than mapping Windows host folders into the container.
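A sketch of what that looks like in a compose file, contrasting a Windows host folder mapping with a named volume. The service, image tag and paths are assumptions for illustration only.

```yaml
# Hypothetical example for Docker Desktop for Windows.
services:
  rabbitmq:
    image: rabbitmq:3                       # assumed image tag
    volumes:
      # Slow: maps a Windows host folder into the container.
      # - C:\data\rabbitmq:/var/lib/rabbitmq
      # Faster: named volume stored inside the WSL distribution.
      - rabbitmq-data:/var/lib/rabbitmq

volumes:
  rabbitmq-data:
```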
Zookeeper has no volume declaration in your compose file and therefore relies on whether the Dockerfile of the image has VOLUME declarations for container paths, in which case an anonymous volume is implicitly created when the container is created. Generally, you can skip declaring anonymous volumes for containers in the compose file when the Dockerfile of the used image already declares the path as VOLUME. Declaring them anyway makes it obvious that those folders are stored outside the container filesystem in an anonymous volume.
Note: neither anonymous nor named volumes are deleted when docker compose down is used, unless the -v argument is appended. Make sure that you remove orphaned anonymous volumes every now and then, so they don't occupy space unnecessarily.