Issue raised with Docker engine space utilization

We have installed Docker on a virtual machine (VM) hosted on Azure, where we frequently build images. After each build, the images are pushed to an artifact registry. To free up space on the VM, we run docker system prune -f -a --volumes, which is intended to remove unused volumes, images, and build cache. Despite this, the Docker engine continues to use a significant amount of disk space. We are unsure of the root cause and, as a temporary workaround, have been increasing the size of the VM to accommodate the growing usage. We would also prefer not to delete the build cache, as it is essential for keeping build times short.

The /var/lib/docker directory contains several folders, including images, network, overlay2, plugins, runtimes, swarm, tmp, and volumes. Of these, overlay2 consumes by far the most space. It contains image-related and container-related directories. Once the images have been pushed to the artifact registry, we believe these directories in overlay2 could be deleted to free up space. However, we are uncertain about the consequences of deleting the overlay2 folder or its contents, so we have avoided doing so and have increased the size of the VM instead.

du will count files in the overlay folder more than once. Please add the argument --one-file-system to make sure the files are only counted once.
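As a minimal sketch of that measurement: the helper name docker_dir_usage is made up for illustration, and -x is the short form of --one-file-system in GNU du. It assumes the default data root /var/lib/docker and needs to run as root to read that directory.

```bash
# Print the disk usage of a directory without descending into other
# mounted filesystems, so overlay mounts are not counted a second time.
# docker_dir_usage is a hypothetical helper name, not a Docker command.
docker_dir_usage() {
    # Assumption: the default Docker data root is used when no path is given.
    dir="${1:-/var/lib/docker}"
    # -s: one summary line, -h: human-readable sizes,
    # -x: short form of --one-file-system.
    du -sh -x "$dir"
}
```

For example, docker_dir_usage /var/lib/docker/overlay2 prints a single total for the overlay2 folder.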

Did you try to remove dangling images? Dangling images appear when you rebuild an image with the same tag, so that the previous version becomes untagged. The same happens if you pull a tag that already exists in the local cache: the tag will point to the newly pulled image, and the previous one becomes untagged.

You can remove dangling images as root user like this: docker rmi $(docker images -f "dangling=true" -q)
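One caveat with the one-liner: when there are no dangling images, the inner command prints nothing and docker rmi is called without arguments, which ends in an error. A small guarded variant could look like this (remove_dangling_images is a hypothetical helper name, only a sketch):

```bash
# Remove dangling images, but only if there are any.
# remove_dangling_images is a hypothetical helper name.
remove_dangling_images() {
    # Collect the IDs of all untagged (dangling) images.
    ids=$(docker images -f "dangling=true" -q)
    if [ -z "$ids" ]; then
        echo "no dangling images"
        return 0
    fi
    # Intentionally unquoted so that each image ID becomes its own argument.
    docker rmi $ids
}
```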


Do you mean you want to avoid the command you are using now? It deletes unused build cache, and I guess that includes everything that is not in use at the moment.

The docker system prune command handles all kinds of data at once. Instead, you can use docker buildx prune, which also removes “dangling build cache”, and which accepts a filter:

docker buildx prune --filter until=72h

which deletes only the build cache that was last used longer ago than the given time period. That way you can control what gets deleted. Cache that was last used a long time ago is probably less important, and in some cases it will never be used again.
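For a build VM, this could be wrapped in a small function and run on a schedule, for example from cron. The sketch below uses the --filter until=<duration> form of the age filter and --force to skip the confirmation prompt; the function name and the 72h default are only illustrative assumptions.

```bash
# Prune build cache that has not been used for a given duration.
# prune_old_build_cache is a hypothetical helper name; 72h is an
# arbitrary default, pick whatever suits your build frequency.
prune_old_build_cache() {
    age="${1:-72h}"
    # --force skips the interactive confirmation prompt.
    docker buildx prune --force --filter "until=$age"
}
```

Calling prune_old_build_cache 168h would keep everything used during the last week, so recent builds still benefit from the cache.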

Yes, you should never touch the filesystem under the docker data root, but you can use commands like the one shared by @meyay, or there is actually a newer alternative which is

docker image prune

If you want to delete all unused images, not just the dangling ones (as the command you shared does), you can add the same -a flag.

The overlay filesystem contains container data as well, so if you have running containers writing a lot of data without volumes, that will increase the size of the overlay2 folder.
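To check whether a container is the culprit, docker ps can report the size of each container's writable layer. A minimal sketch (container_sizes is a hypothetical helper name):

```bash
# List all containers, running or stopped, with the size of their
# writable layer. container_sizes is a hypothetical helper name.
container_sizes() {
    # --size asks Docker to compute the sizes, which can take a while
    # on hosts with many containers.
    docker ps --all --size --format 'table {{.Names}}\t{{.Size}}'
}
```

The Size column shows the data written by the container itself, followed by the virtual size, which also includes the image it is based on.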

The following command tells you more about the currently used space

docker system df

including how much is used by the build cache and how much by the images. The following command will tell you even more about the specific objects

docker system df --verbose

And if you want to see more about the cache entries

docker buildx du --verbose

Notice that the first two commands use “df” and the last one uses “du”.

I don’t think the cache can be deleted one by one (I could not find a solution to that), but at least you know why the cache is large.
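If you want to capture all three views at once, for example before and after a prune, a small wrapper like this could do it (docker_space_report is a hypothetical name, just a sketch):

```bash
# Print the disk usage views mentioned above, one after the other.
# docker_space_report is a hypothetical helper name.
docker_space_report() {
    echo "== summary =="
    docker system df
    echo "== per-object detail =="
    docker system df --verbose
    echo "== build cache entries =="
    docker buildx du --verbose
}
```

Redirecting the output to a file before and after cleaning up makes it easy to compare what was actually reclaimed.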
