The COPY command only allows copying from the host/client into the filesystem of the image being built.
I'd like to do the opposite: copy some files from the build to the host/client during docker build. I can't have the files in the final image, and it has to be done as part of the build.
Are you sure that you didn’t already leak that file through image layers?
Each Dockerfile instruction creates a new image layer. If a file is not deleted in the same Dockerfile instruction where it's created, it is persisted in the image layer for that instruction. Deleting the file in a later instruction does not remove it from the previous layer; it only flags it as deleted.
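A minimal sketch of that pitfall (the file name, base image, and the `do-something-with-the-key` command are placeholders):

```dockerfile
FROM alpine

# BAD: the file is persisted in the layer created by this RUN ...
RUN echo "supersecret" > /tmp/secret.key
# ... and this later RUN only flags it as deleted in a new layer;
# the previous layer still contains the file.
RUN rm /tmp/secret.key

# OK: created, used, and deleted within the same instruction,
# so no layer ever contains the file.
RUN echo "supersecret" > /tmp/secret.key \
 && do-something-with-the-key \
 && rm /tmp/secret.key
```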
This looks like a typical case where what you want might not be the solution you need. Please explain the big picture, so we can come up with the solution you actually need.
Currently, Makefiles for the application are run in the Dockerfile build and the generated binaries are stored in the image. Now, I'd like to move some generated binaries from this image to another image. These binaries are huge, so we can't keep them in the image, but we need them for later use.
We can't run the Makefile again from the other container, since this has to be done for many other containers and the build is time-consuming. There is also the constraint that we are using a build server for this and can't really do docker run, cp, exec, etc.
I don't see a way to get the files out of the image, but you could leverage multi-stage builds to minimize the final image.
You could build everything in one stage and use COPY --from to copy the required files into the final image.
Only the final stage is tagged, though you can use docker build --target {stagename} to build a different stage as the final image.
Assume you have this Dockerfile:

```dockerfile
FROM whateverbaseimagea:tag AS builder
RUN dosomething
FROM whateverbaseimageb:tag AS binaries
COPY --from=builder /source/path/from/builder/stage /target/path/in/binaries/stage
FROM whateverbaseimagec:tag AS final
COPY --from=builder /source/different/path/from/builder/stage /target/different/path/in/final/stage
```
Then build the final image with docker build -t repo1:tag . and follow it with docker build --target binaries -t repo2:tag . The second build should be served completely from the cache. As a result you have two different images that can be used for different purposes.
If this doesn't help, then you should probably consider building outside the image build and just copying the artifacts you need from the build context into the image. Maybe the build server itself has something for that purpose. For instance, if you run GitLab CI/CD, you can declare paths that should be cached or used as artifacts shared among jobs, regardless of whether the jobs run in a container executor or a shell executor on a host.
I think the solution is indeed a multi-stage build as @meyay suggested, although you can add a stage that doesn't contain anything (FROM scratch AS copytohost) and use it as the target (--target copytohost); that stage holds a single COPY command to copy the binaries into it. I use this to build my documentation with Sphinx and copy the result out of the image. This is my update.sh script.
As you can see, I use the --output option of the build subcommand to tell Docker to copy the root filesystem of the target stage to a specified directory. I learned this from the source code of Docker Compose, but they use docker buildx bake now.
This means you can use the approach even if your images can't be built from a single multi-stage Dockerfile: one Dockerfile builds the binary and copies it out of the image, and other Dockerfiles (optionally multi-stage too) copy the binaries into the other images.
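To make the FROM scratch export stage concrete, here is a minimal sketch; the stage names, paths, and output directory are made up for illustration:

```dockerfile
# Build stage: compiles the binaries
FROM whateverbaseimage:tag AS build
RUN make

# Export stage: contains nothing but the files we want on the host
FROM scratch AS copytohost
COPY --from=build /app/bin /
```

```shell
# Export the root filesystem of the copytohost stage to ./binaries
# (requires BuildKit, which is the default with buildx)
docker build --target copytohost --output type=local,dest=./binaries .
```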
Thank you for sharing!
I was not aware that COPY --link --from creates a new image layer that only contains the copied content from another build stage, and that it’s possible to export the root filesystem of an image to the host filesystem.
The key is FROM scratch. I never tried it with --link; I just used --from to copy the files into a stage based on "scratch". It is probably worth mentioning that it works only with BuildKit, but BuildKit is enabled by default everywhere now, thanks to buildx being the default builder.
Thanks both meyay and rimelek. Based on my understanding of what you have shared:
Create a final stage (FROM scratch AS copytohost) and copy the files there using COPY --from. The copied files are available on the host if the build is executed with the --output option and BuildKit enabled.
Then docker build is run again with a target stage, the one before the copytohost stage, to get the actual image. Thus docker build runs twice but, as you pointed out, the second build comes entirely from cache.
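In command form, the two builds could look like this (the stage names and tag are placeholders):

```shell
# 1st build: export the binaries from the scratch-based stage to the host
docker build --target copytohost --output type=local,dest=./binaries .

# 2nd build: produce the actual image; all layers were already built in
# the first run, so this one is served from the build cache
docker build --target final -t repo1:tag .
```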
Please let me know if I interpreted this solution correctly.