For starters, I’m quite new to Docker and containerization. I have been using Docker for a month and I’m having difficulties managing the storage used by the images. I’m planning to use Docker to install several open-source projects from GitHub for research purposes. So far the storage usage for each image is not great, I’m talking about 20~80 GB per image. Is that normal? I’m also wondering whether using Docker Compose would be the best practice to manage this. The projects I use share some base dependencies such as the CUDA Toolkit and some Python packages, so perhaps those could be installed in a base image, or is it better to install them in the WSL distro instead of putting them in the container?
Whether you need a container or a WSL distro depends on the use case. If you need to reproduce and share the same environment, an image could be better, but 20-80 GB is definitely not normal. You could also think about creating a virtual machine image instead. Even with CUDA and Python packages, 20 GB still sounds huge.
The biggest official CUDA image I see is about 4 GB. So if you managed to make it 20 GB or even bigger, that 16+ GB is a huge difference. Whether it can be optimized depends on how you built the image. I personally know images of around 10 GB containing AI and Python related requirements, and I have seen some (Linux) images from Microsoft that are that big, so 20 GB is probably achievable, although I have never seen it myself. But 80 GB is definitely something I would not use for a container; I would try to identify the different operations that could be implemented in separate containers communicating with each other.
Well, that’s why I asked about best practice in the forum: I want to find a better way, or at least cross-check whether what I’m currently doing is correct. I also need the image because I have to reproduce and share the same environment, and installing these repositories on multiple computers is quite a hassle and a challenge in itself.
I’ve checked what could be making the image big: for my 20 GB image it is likely CUDA, TensorFlow and other related Python packages. I think there are also some cache files created during the build that need to be cleaned up to reduce the space.
But I’m also wondering: if I set up Docker Compose with a base image that has CUDA and the base Python, will that be more space efficient when I’m creating multiple images that share these requirements?
Docker Compose is nothing but a client with some extra features so you don’t have to write a script for everything, but it will not reduce the size of any image you build.
You can disable the pip cache by passing --no-cache-dir. You also need to be careful when you have multiple instructions installing similar packages: one can override already installed packages when something needs a newer dependency, and the older files will never be deleted unless you run all the commands in a single instruction in the Dockerfile.
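For example, a minimal sketch of a single instruction with the pip cache disabled (the base image and package names are just placeholders):
FROM python:3.11-slim
COPY requirements.txt .
# one RUN layer with the pip cache disabled, so replaced or upgraded files
# don't stay behind in an earlier layer
RUN pip install --no-cache-dir numpy && \
    pip install --no-cache-dir -r requirements.txt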
You can use the docker image history IMAGENAME command to find out the sizes of the layers; there you can also see the instruction that created each layer. To make it more readable you will want to use the --format option and --no-trunc.
Example:
docker image history IMAGENAME --format json --no-trunc
You can also use jq to format the output, or just use the Go template format.
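For example, to print only the size and the instruction for each layer with a Go template:
docker image history IMAGENAME --no-trunc --format '{{.Size}}\t{{.CreatedBy}}'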
Then you can try changing the instructions, removing some packages, and checking whether package versions change between layers.
Once you know the reason the image is big, fixing it is not always easy. Years ago I was asked to optimize a Docker image and had to combine some apt installation commands because one changed the dependencies of another. I also had to turn off the optional (recommended) packages, which can lead to missing packages too; those have to be added back explicitly instead of relying on automatically installed dependencies, since most of those packages are not used at all.
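Roughly what that looks like in a Dockerfile; the package names are placeholders, and the rm -rf /var/lib/apt/lists/* at the end cleans up the apt cache in the same layer:
RUN apt-get update && \
    apt-get install -y --no-install-recommends package-a package-b && \
    rm -rf /var/lib/apt/lists/*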
Hi there! Great question — what you’re describing is actually quite common when running multiple research-oriented Docker images, especially those involving CUDA or large Python dependencies.
A few quick pointers that should help:
Use a shared base image – If several projects depend on the same CUDA version or Python packages, create a custom base image (for example, FROM nvidia/cuda:12.1-base) and build each project image on top of that. This allows Docker’s layer caching to reuse the shared dependencies, significantly reducing total storage.
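A rough sketch of the idea, assuming a hypothetical shared image called research-base:cuda12.1 and a CUDA tag that matches what your projects actually need:
# Dockerfile.base, built once: docker build -t research-base:cuda12.1 -f Dockerfile.base .
FROM nvidia/cuda:12.1.0-base-ubuntu22.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Dockerfile of each project, starting from the shared base
FROM research-base:cuda12.1
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt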
Clean up intermediate layers and unused data – Run:
docker system prune -a
docker builder prune
These commands remove the build cache and any images not used by a container; note that with -a, docker system prune removes all unused images, not only dangling ones, so make sure you don’t still need them.
Docker Compose won’t reduce image size directly, but it’s great for organizing multiple containers and managing shared resources like volumes or networks. It simplifies workflow, not necessarily storage.
Consider multi-stage builds – For projects that need to compile code, use multi-stage Dockerfiles so the final image contains only the runtime dependencies, not the entire build environment.
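A minimal multi-stage sketch, assuming the heavy packages come from a requirements.txt (names and paths are placeholders):
# build stage: full image with compilers, used only to build wheels
FROM python:3.11 AS builder
COPY requirements.txt .
RUN pip wheel --no-cache-dir -r requirements.txt -w /wheels

# final stage: slim runtime image, only the built wheels are copied in
FROM python:3.11-slim
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels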
About WSL – Installing CUDA or heavy Python packages directly in WSL2 can make sense if you use the environment repeatedly for experiments, but it loses the reproducibility benefit Docker provides. A good compromise is keeping a lightweight WSL distro and using optimized container layers for projects.
In short: reuse a common base image, prune regularly, and use multi-stage builds — your disk usage should drop significantly.
Great! Thanks for the suggestions. I just used multi-stage builds and careful cache cleanup and was able to reduce the image size quite significantly.
I’m actually considering implementing the base image, but some projects have different CUDA version requirements, and I’m not sure whether having multiple CUDA versions in the base image is a good idea. Or maybe I should just expect the projects to be forward compatible with newer CUDA, or vice versa (CUDA being backward compatible with the code).