Docker containers taking up all space over time

Hi,
I’ve been using Docker Compose deployments on different servers. On each of the servers I have an issue where a single container takes up all the disk space over a long period of time (±1 month). When analysing the disk usage with du -sh, most of the usage is located in /var/lib/docker/overlay2, but the numbers do not add up. For example, df -h gives me

df -h
/dev/mapper/ubuntu--vg-ubuntu--lv  217G  202G  5.7G  98% /

while

sudo du -cha --max-depth=1 / | grep -E "M|G"
1.2G	/snap
1.6M	/run
4.1G	/swap.img
1.3G	/opt
211M	/boot
17G	/home
5.4M	/etc
3.3G	/usr
34G	/var
60G	/
60G	total

Digging a bit further into the /var folder, we notice that the overlay2 folder is taking up most of the space, but nowhere near the 200 GB reported by df:

sudo du -cha --max-depth=1 /var/lib/docker | grep -E "M|G"
20G	/var/lib/docker/overlay2
2.8G	/var/lib/docker/containers
18M	/var/lib/docker/image
22G	/var/lib/docker
22G	total
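
(Side note for anyone debugging something similar: Docker’s own accounting and the per-layer sizes can be checked with something like the commands below; the paths assume the default /var/lib/docker data root.)

# Docker’s own view of image / container / volume usage
docker system df -v

# largest overlay2 layer directories; the “diff” dirs grow with container writes
sudo sh -c 'du -sh /var/lib/docker/overlay2/*/diff | sort -h | tail -n 10'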

My Docker Compose file spawns 3 containers. One of them I can kill, and suddenly all the space gets freed.
I’ve read several Stack Overflow & Docker forum threads, but still could not find the root cause of this issue. The most interesting post I’ve found is “Docker overlay2 eating Disk Space” on Stack Overflow.

The explanation about overlay2 creating a diff on each append/delete/create could fit the behavior I’m noticing.
It still does not explain why du -sh cannot pick up on all the used data, nor how to track this down. I’ve tried to use volumes on all the services that generate logs, yet the issue still persists.
The two other containers running in the same Docker Compose project do not have this problem. For reference, the problematic container is based on Docker
The image I use just adds some Python libraries to it and mounts the Python source code into the container.

When entering the container using docker exec -it <container_name> /bin/bash and executing du -sh, I only see a usage of around 4 GB.
The servers run on Ubuntu 20.04 with Docker version 20.10.21, build baeda1f.
I also run some older servers which are on Ubuntu 18.04 & have exactly the same problem.

Does anyone know what might be causing the behavior I’m noticing and/or how to debug it?

Thanks in advance!
Br
Alex

I moved your topic to “Docker Engine” since the question has nothing to do with Docker Desktop.

I’m not sure why du gives you a smaller size than df, but when one container takes up almost all the disk space, it could be because of a large amount of log files in the container, or because of wrong Docker settings and large, non-rotated Docker logs, which are not part of the container’s filesystem but are deleted when the container is deleted.

The other reason is what you suspect, but I’m not sure you interpret the answer on Stack Overflow correctly. Just because you have a lot of operations in the container, it will not use more and more space indefinitely. You have an image filesystem (read-only) and a container filesystem (writable). You can add new files on the writable layer and change them, or edit a file which is not on the writable layer yet. In that case, with overlay2, the data is copied from the read-only filesystem to the writable layer and you can edit it there. Editing the file multiple times will not copy it multiple times. Once a file is on the writable layer, it is a regular file.
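
If you want to see that copy-up behavior in isolation, a small sketch like this illustrates it (the alpine image and the container name are just examples):

# illustrative sketch: copy-up happens once per file, not once per write
docker run -d --name copyup-demo alpine sleep 600
docker exec copyup-demo sh -c 'echo change >> /etc/os-release'   # file is copied from the image to the upper layer
docker exec copyup-demo sh -c 'echo more >> /etc/os-release'     # edits the already-copied file, no second copy
sudo du -sh "$(docker inspect copyup-demo --format '{{ .GraphDriver.Data.UpperDir }}')"
docker rm -f copyup-demo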

So if the issue is copying a file to the writable layer, that must be a very big file, but you can’t get a bigger size than twice the size of the image. If the size of the container from inside is 4 GB, removing the container would not free more than 4 GB plus the size of the Docker container log, and stopping the container should not free any space at all. Except when you are not actually killing the container but stopping it (sending SIGTERM), so processes can remove files before exiting. The following command takes you to the folder of the container in the Docker data root:

container=test
cd $(dirname $(docker inspect "$container" --format '{{ .GraphDriver.Data.MergedDir }}'))

Change the value of “container”. Then you can check the size of that folder and what you find there. There are multiple folders: “diff” stores everything that changed compared to the image, and “merged” shows the filesystem you see from inside the container, including everything the read-only layers contain. So when you use du to determine the size of the Docker data root, you should also use the -x flag, otherwise you would get a bigger size. The interesting thing is that your problem was not that du showed a bigger size, but df. So I can’t explain it, and definitely not such a big difference.
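
For example, from the folder the previous command takes you to (just a sketch):

# size of the container’s own changes vs. the merged view of image + changes
# (“merged” is only mounted while the container is running)
sudo du -shx diff merged

# and the whole data root, without crossing into other mounted filesystems
sudo du -shx /var/lib/docker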

I would put my money on “chatty logs combined with no log rotation are responsible”.
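
If that turns out to be it, rotation for the default json-file driver can be capped globally in /etc/docker/daemon.json, roughly like this (example values; merge with any existing settings, and it only applies to containers created after the daemon restart):

# write (or merge into) /etc/docker/daemon.json, then restart the daemon
sudo tee /etc/docker/daemon.json >/dev/null <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF
sudo systemctl restart docker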

@meyay made me realize I forgot to mention the folder that actually stores the json logs. I assumed it would be the same folder I shared in my previous post, even though I knew it wasn’t… So here is another command that takes you to the folder with the container metadata and some mounted files:

container=test
cd $(dirname $(docker inspect "$container" --format '{{ .LogPath }}'))
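
To get the size of that json log directly, something like this also works:

docker inspect "$container" --format '{{ .LogPath }}' | xargs sudo du -h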

Thank you for your takes!
I’ve executed the commands you provided.
The folder I ended up in was 280 KB. Log rotation was set up btw, capped at 1000 MB for each container. Your explanation forced me to start digging into why du -sh & df were reporting different values. One of the reasons this can happen is deleting a file while it is still held open by another process (more info: https://www.linuxquestions.org/questions/linux-general-1/different-results-in-du-and-df-841145/).

Apparently, the base image I was using was creating log files that, I think, were being deleted by another process. The logger from the base image did not close its file handle, so the files were deleted but still held open. Killing the container killed the logger inside it, which closed the file handle and freed up the space. This was not really a Docker problem after all… Thank you for pointing me in the right direction & providing valuable info!
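
For anyone who lands here with the same du vs df mismatch: deleted-but-still-open files can be listed with lsof, for example (assuming lsof is installed on the host; run it inside the container as well if the image ships it):

# open files whose directory entry has been removed (they keep using space until closed)
sudo lsof +L1

# alternative: look for entries marked “(deleted)” in the full listing
sudo lsof -nP | grep '(deleted)' | head -n 20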
Br
Alex