How to avoid transferring everything as build context

The concept of a “build context”, where the whole directory is sent to the Docker daemon, looks totally dumb to me. It is not clear why we need to send everything to the daemon instead of having it read the files when they are requested. Am I missing anything?

I need to build various images which will be almost the same, except for one heavy (up to a few gigabytes) binary. There are going to be more than 10 binaries (zip files), but only one of them is needed for any one image. Say there are the following directories:
mydir/dockerfile
mydir/bin/heavy1.zip
mydir/bin/heavy2.zip

mydir/bin/heavy10.zip
mydir/scripts/script1.sh
mydir/scripts/script2.sh

mydir/scripts/script10.sh
(the scripts are all used every time, but only one binary at a time)

Inside the Dockerfile I run COPY, then unzip one of the zips depending on the parameters passed.

I am not sure where I should place them:
a. If in the current directory (or the Dockerfile location) or lower, will ALL of them be sent as the build context for any image?
b. If in a directory outside of the Dockerfile location, the build fails with “Forbidden path outside the build context”.

How do I make the binaries reachable by the Dockerfile without sending all of them to the daemon every time?

You’ll always need to place all the files inside the folder (or a subdirectory) where your “Dockerfile” is located.

That is not true. I have the following structure:
mydir/bin/heavy1.zip
mydir/bin/heavy2.zip

mydir/bin/heavy10.zip
mydir/scripts/script1.sh
mydir/scripts/script2.sh

mydir/scripts/script10.sh
mydir/scripts/dockerfile

and, with mydir/ as the current directory, I run
docker build scripts/dockerfile

My binaries are not located in the same directory as the Dockerfile, but they are visible to commands like COPY:
COPY bin/binary1.zip …

Anyway, again, the question is how to AVOID sending the whole current directory, with all the heavy binaries, to the Docker daemon for every “docker build” when each build needs just ONE of those binaries.

To avoid copying the whole dir you must either specify each file separately or use variables inside the Dockerfile -> https://vsupalov.com/docker-arg-env-variable-guide/

How will “or use variables inside the Dockerfile” avoid sending all the binaries, when the whole current directory with all its subdirectories is sent to the Docker daemon regardless of what you put in the Dockerfile?

Something does not add up. The folder content and the docker build command you provided do not match. Please paste your exact docker build command without build-args.

Here is the directory structure posted above
mydir/bin/heavy1.zip
mydir/bin/heavy2.zip

mydir/bin/heavy10.zip
mydir/scripts/script1.sh
mydir/scripts/script2.sh

mydir/scripts/script10.sh
mydir/scripts/dockerfile

Here is the Dockerfile content
FROM some_base
COPY bin/heavy1.zip /tmp

Here is the docker build command, run from directory “mydir”
docker build -t test -f scripts/Dockerfile .

Like I thought. Your original build command was misleading.
Use `docker build -t test -f ./scripts/Dockerfile ./scripts` to set the build context to mydir/scripts.

Like mgabeldocker wrote: you won’t be able to access files outside the build context…

I don’t need a “solution” where I won’t be able to access the files I need to access. The build command is not misleading. I can use “./scripts” instead of “.”, but then I would have to move “bin” into “scripts” with the same result - all the binaries will be sent to the daemon.

… I guess you will either have to live unhappy with how docker builds work or open an issue in Docker’s GitHub project.

I didn’t believe people could make such a crappy design; I was hoping I had missed something.

Why not simply move your Dockerfile (and only that one) one directory level up so it sits in mydir ?
I can see you like/want to keep a structure here … but as Metin pointed out - currently that’s the docker design.
So if you “insist” on keeping the structure you’ll need to write a script that will copy the files in the right location, then build the image, and delete the stuff afterwards … Or use a build tool like Jenkins

How will moving the Dockerfile one level up prevent the whole of mydir from being sent to the Docker daemon?

I don’t insist on keeping any exact structure. I am ready to consider any working solution which will not send all the binaries to the Docker daemon. So far the only option I see is to keep the binaries somewhere outside of the current directory (outside of the build context), copy the binary required for each particular image into the build context, and remove it again afterwards, every time.
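
The copy-then-clean-up workaround described above could be scripted roughly as follows. This is only a sketch: stage_context is a hypothetical helper name, the paths assume the mydir layout from this thread, and it stages the files in a temporary directory instead of inside mydir so that nothing has to be moved back afterwards.

```shell
#!/bin/sh
# Sketch: assemble a throwaway build context holding the scripts plus ONE binary.
# stage_context is a hypothetical helper; paths follow the mydir layout above.
stage_context() {
    src=$1      # directory that contains bin/ and scripts/
    wanted=$2   # name of the single zip inside bin/ that this image needs
    ctx=$(mktemp -d) || return 1
    mkdir -p "$ctx/bin" &&
    cp -R "$src/scripts" "$ctx/scripts" &&
    # Hard-link to avoid copying gigabytes; fall back to a real copy if the
    # temporary directory lives on a different filesystem.
    { ln "$src/bin/$wanted" "$ctx/bin/$wanted" 2>/dev/null ||
      cp "$src/bin/$wanted" "$ctx/bin/$wanted"; } &&
    # Print the staged directory so the caller can build from it.
    printf '%s\n' "$ctx"
}

# Example usage (needs a running Docker daemon):
#   ctx=$(stage_context /path/to/mydir heavy1.zip)
#   docker build -t test -f "$ctx/scripts/dockerfile" "$ctx"
#   rm -rf "$ctx"
```

Only the staged directory is used as the build context, so the other binaries never reach the daemon. The hard link keeps the staging step cheap when the temporary directory is on the same filesystem as the binaries.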

All files you want to add to your docker image must be inside the “context”. Otherwise docker will throw an error.
What you can do is to explicitly add … well COPY … only the needed files. This can be specified inside the Dockerfile.
Assuming your Dockerfile resides in /mydir, something like:

FROM ubuntu:latest
COPY bin/heavy1.zip /tmp
ENTRYPOINT ["myStartScript.sh"]

Now this will copy only the /mydir/bin/heavy1.zip file to /tmp of the image.
(Still assuming you’re in /mydir) Build it with:

docker build -t test .

Bear in mind that this will only copy, not unpack, the .zip file to the image’s /tmp dir.

I am not getting what exactly your solution is. Before that you said I should use ENV and it would solve the problem; now you are saying I have to move the Dockerfile and that will solve the problem.

Again, if I have this structure

mydir/bin/heavy1.zip
mydir/bin/heavy2.zip

mydir/bin/heavy10.zip
mydir/scripts/script1.sh
mydir/scripts/script2.sh

mydir/scripts/script10.sh

and have docker file in mydir
mydir/dockerfile
and run
docker build -t test .

regardless of the Dockerfile content, ALL the binaries will be sent to the Docker daemon before the Dockerfile is even processed, because they are located in the build context.

Well … you’d use ENV when you want to have a more “dynamic” Dockerfile instead of a hard-wired …
But that’s a different topic!

Let me show you an example of how I did it:
I have the following structure:

root@deimos:/data/docker/test# ls -R
.:
Dockerfile testdir

./testdir:
dir1 dir2 dir3

./testdir/dir1:
test1.txt test2.txt

./testdir/dir2:
dir2-test.txt

./testdir/dir3:
andnowsomecompletelydifferent myZipfile.zip

And this is my Dockerfile (which resides in /data/docker/test):

FROM ubuntu:20.04
RUN mkdir /var/testdir
COPY testdir/dir3/myZipfile.zip /var/testdir
CMD tail -f /dev/null

Now let’s build it:

root@deimos:/data/docker/test# docker build -t test .
Sending build context to Docker daemon 9.728kB
Step 1/4 : FROM ubuntu:20.04
 ---> 1d622ef86b13
Step 2/4 : RUN mkdir /var/testdir
 ---> Using cache
 ---> 319ef61748c5
Step 3/4 : COPY testdir/dir3/myZipfile.zip /var/testdir
 ---> 31b7e2b0004e
Step 4/4 : CMD tail -f /dev/null
 ---> Running in 3fccbcbb2b3c
Removing intermediate container 3fccbcbb2b3c
 ---> da63457b11fd
Successfully built da63457b11fd
Successfully tagged test:latest

As you can see only the myZipfile.zip file got copied to the image’s /var/testdir despite the other directories or files.
Let’s check:

root@deimos:/data/docker/test# docker run --rm -d da63457b11fd
4a925bedef0deb223ae31d8d7b30d05aeb4f2cb7dc28fd0577d91c9bc15793c5

root@deimos:/data/docker/test# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4a925bedef0d da63457b11fd "/bin/sh -c 'tail -f…" 5 seconds ago Up 4 seconds affectionate_meninsky

root@deimos:/data/docker/test# docker exec -it 4a925bedef0d /bin/bash
root@4a925bedef0d:/# cd /var/testdir/
root@4a925bedef0d:/var/testdir# ls -la
total 12
drwxr-xr-x 1 root root 4096 May 20 11:19 .
drwxr-xr-x 1 root root 4096 May 13 09:49 ..
-rw-r--r-- 1 root root 504 May 20 11:11 myZipfile.zip

You see, only the .zip file is present here, as it is, not unpacked.

If you look carefully at the output of your build, you can see this message:
Sending build context to Docker daemon 9.728kB

This is where ALL the files from the build context (in your case /data/docker/test with everything inside) are transferred to the Docker daemon.

Now, don’t change your Dockerfile, but add 10 more zip files of 2 GB each anywhere inside /data/docker/test, re-run the build, and see what happens to “Sending build context to Docker daemon”.

Oh now I get you …

Currently this “effect” is by design → Docker docs: Build with PATH

BUT … I just stumbled over a Stack Overflow post that might solve the issue → build context for docker image very large - Stack Overflow

I’ve tried it myself:

export DOCKER_BUILDKIT=1
docker build -t test:0.3 .
[+] Building 0.5s (8/8) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 145B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for Docker 0.0s
=> CACHED [1/3] FROM Docker 0.0s
=> [2/3] RUN mkdir /var/testdir 0.4s
=> [internal] load build context 0.0s
=> => transferring context: 622B 0.0s
=> [3/3] COPY testdir/dir3/myZipfile.zip /var/testdir 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:a69f603d533eda88b2969a36022d9249f94d5bf61dd39d5260fbda30d3e381ef 0.0s
=> => naming to Docker

… works !

root@deimos:/data/docker/test# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
test 0.3 a69f603d533e 14 seconds ago 73.9MB

Looks like you gave me two working solutions!

I knew about .dockerignore, but after your message I realized I could generate .dockerignore on the fly, containing just two lines that ignore everything in bin/ except the binary I need, something like:
bin/*
!bin/TheBinaryINeedForThisImage.zip
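
A minimal sketch of generating that file before each build, assuming the mydir layout from this thread (write_dockerignore is a hypothetical helper name, not part of Docker):

```shell
#!/bin/sh
# Sketch: write a .dockerignore that excludes everything in bin/ except one zip.
# write_dockerignore is a hypothetical helper; bin/ is the layout from this thread.
write_dockerignore() {
    context_dir=$1   # root of the build context, e.g. mydir
    wanted=$2        # file name inside bin/ to keep, e.g. heavy1.zip
    {
        printf 'bin/*\n'             # ignore every entry directly under bin/
        printf '!bin/%s\n' "$wanted" # re-include the one binary this image needs
    } > "$context_dir/.dockerignore"
}

# Example usage from mydir (needs a running Docker daemon):
#   write_dockerignore . heavy1.zip
#   docker build -t test -f scripts/Dockerfile .
```

Because the file is shared by every build that uses this context, builds running in parallel would overwrite each other’s .dockerignore, so this sketch assumes one build at a time.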

I also read that link earlier and had already tested DOCKER_BUILDKIT=1, but I didn’t interpret the result correctly: I saw that it still transfers context at an early stage, but it actually analyzes the Dockerfile and transfers only what is used in COPY commands. So this is a working solution, exactly what I need.
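
To pick the binary per image with BuildKit, the zip name could be passed as a build argument. The sketch below is an illustration, not something tested in this thread: HEAVY_ZIP, write_dockerfile and the /opt/scripts/ destination are hypothetical names, some_base is the placeholder base image used earlier, and it assumes BuildKit resolves the argument before deciding which files to transfer.

```shell
#!/bin/sh
# Sketch: one Dockerfile, parameterised by a build argument, built with BuildKit.
# HEAVY_ZIP, write_dockerfile and /opt/scripts/ are hypothetical names;
# some_base is the placeholder base image from earlier in the thread.
write_dockerfile() {
    # The quoted 'EOF' keeps ${HEAVY_ZIP} literal so Docker expands it, not the shell.
    cat > "$1" <<'EOF'
FROM some_base
ARG HEAVY_ZIP
COPY scripts/ /opt/scripts/
COPY bin/${HEAVY_ZIP} /tmp/
EOF
}

# Example usage from mydir (needs a running Docker daemon):
#   write_dockerfile scripts/Dockerfile
#   DOCKER_BUILDKIT=1 docker build --build-arg HEAVY_ZIP=heavy1.zip \
#       -t test:heavy1 -f scripts/Dockerfile .
```

Comparing the “transferring context” size in the build output between two different HEAVY_ZIP values is a quick way to check whether only the selected binary is sent.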

Thank you bro, great job!

Glad I could help :)