The COPY command only allows copying from the host/client into the filesystem of the image being built.
I'd like to do the opposite: copy some files from the build to the host/client during docker build. I can't have the files in the final image, and it has to be done as part of the build.
Are you sure that you didn’t already leak that file through image layers?
Each Dockerfile instruction creates a new image layer. If a file is not deleted in the same Dockerfile instruction where it's created, it is persisted in the image layer for that instruction. Deleting the file in a later instruction does not remove it from the previous layer; it only flags it as deleted.
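A minimal sketch of that pitfall (the file name, base image, and the `do-something-with-the-key` command are placeholders):

```dockerfile
FROM alpine

# BAD: the file is persisted in the layer created by this RUN ...
RUN echo "supersecret" > /tmp/secret.key
# ... and this later RUN only flags it as deleted in a new layer;
# the previous layer still contains the file.
RUN rm /tmp/secret.key

# OK: created, used, and deleted within the same instruction,
# so no layer ever contains the file.
RUN echo "supersecret" > /tmp/secret.key \
 && do-something-with-the-key \
 && rm /tmp/secret.key
```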
This looks like a typical case where what you want might not be the solution you need. Please explain the big picture, so we can come up with the solution you actually need.
Currently, Makefiles for the application are run in the Dockerfile build and the generated binaries are stored in the image. Now, I'd like to move some generated binaries from this image to another image. These binaries are huge, so we can't keep them in the image, but we need them for later use.
We can't run the Makefile again from the other container, since this has to be done for many other containers and the build is time-consuming. There is also the constraint that we are using a build server for this and can't really do docker run, cp, exec, etc.
I don't see a way to get the files out of the image, but you could leverage multi-stage builds to minimize the final image.
You could build everything in one stage and use COPY --from to copy the required files into the final image.
Only the final stage is tagged, though you can use docker build --target {stagename} to build a different stage as the final image.
Assume you have this Dockerfile:

```dockerfile
FROM whateverbaseimagea:tag AS builder
RUN dosomething
FROM whateverbaseimageb:tag AS binaries
COPY --from=builder /source/path/from/builder/stage /target/path/in/binaries/stage
FROM whateverbaseimagec:tag AS final
COPY --from=builder /source/different/path/from/builder/stage /target/different/path/in/final/stage
```
Then build the final image with docker build -t repo1:tag . and follow it with docker build --target binaries -t repo2:tag . The second build should be served completely from the cache. As a result you have two different images that can be used for different purposes.
If this doesn't help, then you should probably consider building outside the image build and just copying the artifacts you need from the build context into the image. Maybe the build server itself has something for that purpose. For instance, if you run GitLab CI/CD, you can declare paths that should be cached or used as artifacts shared among jobs, regardless of whether the jobs run in a container executor or a shell executor on a host.
I think the solution is indeed a multi-stage build as @meyay suggested, although you can add a stage that doesn't contain anything (FROM scratch AS copytohost) and use it as the target (--target copytohost); that stage holds a single COPY command to copy the binaries into it. I use this to build my documentation with Sphinx and copy the result out of the image. This is my update.sh script.
As you can see, I use the --output option of the build subcommand to tell Docker to copy the root filesystem of the target stage to a specified directory. I learned this from the source code of Docker Compose, but they use docker buildx bake now.
This means you can use the approach even if your images can't be built from a single multi-stage Dockerfile: one Dockerfile builds the binary and copies it out of the image, and other Dockerfiles (optionally multi-stage too) copy the binaries into the other images.
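To make the FROM scratch export stage concrete, here is a minimal sketch; the stage names, paths, and output directory are made up for illustration:

```dockerfile
# Build stage: compiles the binaries
FROM whateverbaseimage:tag AS build
RUN make

# Export stage: contains nothing but the files we want on the host
FROM scratch AS copytohost
COPY --from=build /app/bin /
```

```shell
# Export the root filesystem of the copytohost stage to ./binaries
# (requires BuildKit, which is the default with buildx)
docker build --target copytohost --output type=local,dest=./binaries .
```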
Thank you for sharing!
I was not aware that COPY --link --from creates a new image layer that only contains the copied content from another build stage, and that it’s possible to export the root filesystem of an image to the host filesystem.
The key is FROM scratch. I never tried it with --link; I just used --from to copy the files into a stage based on "scratch". It is probably worth mentioning that it works only with BuildKit, but BuildKit is enabled by default everywhere now, thanks to buildx being the default builder.
Thanks both meyay and rimelek. Based on my understanding of what you have shared:
Create a final stage (FROM scratch AS copytohost) and copy the files there using COPY --from. The copied files are available on the host if the build is executed with the --output option and BuildKit enabled.
Then docker build is run again with a target stage, the one before the copytohost stage, to get the actual image. Thus docker build runs twice but, as you pointed out, the second build comes entirely from cache.
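In command form, the two builds could look like this (the stage names and tag are placeholders):

```shell
# 1st build: export the binaries from the scratch-based stage to the host
docker build --target copytohost --output type=local,dest=./binaries .

# 2nd build: produce the actual image; all layers were already built in
# the first run, so this one is served from the build cache
docker build --target final -t repo1:tag .
```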
Please let me know if I interpreted this solution correctly.