I have a Dockerfile that downloads a large 2GB tar file via curl and untars it. The extracted data is then linked to a volume in docker-compose, and this volume is consumed by another service in the same compose file.
However, because of this, the image built from the Dockerfile is 2GB, and the volume is also 2GB. If I understand correctly, this doubles the space taken on my filesystem.
Is it possible to have a downloader service that stores data directly in a volume without persisting it in the image?
Excerpt of my project
Dockerfile
FROM ubuntu:22.04
WORKDIR /project
...
RUN ...
&& curl FILENAME -o $(basename "$FILE_NAME") \
&& tar -xf $(basename "$FILE_NAME") \
&& rm $(basename "$FILE_NAME")
...
Instead of downloading in the Dockerfile, download the file from a shell script that runs as the ENTRYPOINT.
The key point is that the ENTRYPOINT only executes when a container runs, not when the image is built. This means that the size of the image is independent of the download!
This does mean that ENTRYPOINT executes again when a container is created, starting another download. We can avoid that by checking if the file already exists (which it should, given that the volume used is the same).
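A minimal sketch of what such an entrypoint script could look like. This is an illustration, not the exact script from the project: the DATA_URL and DATA_DIR environment variables, the archive.tar temporary name and the .download-complete marker file are all hypothetical names chosen for the example. The marker is written only after extraction succeeds, so an interrupted download is not mistaken for a finished one.

```sh
#!/bin/sh
# Sketch of an entrypoint that downloads into a mounted volume only once.
# DATA_URL, DATA_DIR and MARKER are assumed names, not part of the original project.
set -e

DATA_DIR="${DATA_DIR:-/data}"   # where the volume is assumed to be mounted
MARKER=".download-complete"     # written only after a successful extraction

# Succeeds if an earlier run already populated the directory.
data_present() {
    [ -f "$1/$MARKER" ]
}

# Download the archive into the volume, unpack it there, then clean up.
fetch_data() {
    mkdir -p "$2"
    curl -fSL "$1" -o "$2/archive.tar"
    tar -xf "$2/archive.tar" -C "$2"
    rm "$2/archive.tar"
    touch "$2/$MARKER"
}

# Download only when the marker is missing.
ensure_data() {
    if data_present "$2"; then
        echo "data already present in $2, skipping download"
    else
        fetch_data "$1" "$2"
    fi
}

if [ -n "${DATA_URL:-}" ]; then
    ensure_data "$DATA_URL" "$DATA_DIR"
fi

# Hand over to the container's command, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

The Dockerfile would then only COPY this script and set it as the ENTRYPOINT, and the compose file would mount the shared volume at the DATA_DIR path. Note that the archive is stored temporarily inside the volume while it is being unpacked, so the volume briefly needs room for both the archive and the extracted files.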
Minor correction:
This does mean that ENTRYPOINT executes every time a container based on the image is started, starting another download. We can avoid that by checking if the file already exists (which it should, given that the volume used is the same).
This is true regardless of whether a container is freshly created and started, or an existing container is stopped and restarted.