Measurements of impact of combining RUN/LABEL commands?

I’ve read in multiple places the advice to combine RUN and LABEL commands (and others?) to minimize layering, for a “more efficient image”. That seems reasonable, but AUFS supposedly makes this kind of thing pretty efficient. Has anyone actually measured the efficiency difference and reported on it?

I think there are two practical reasons to want fewer layers:

  1. If you have a step that downloads things then tries to clean up after itself, those must be in the same RUN step (a standalone RUN rm ... layer is useless)
  2. There’s a limit of (IIRC) 127 layers, and if your setup is especially complicated you could bump into this

Occasionally it’s helpful to have more:

  1. If you’re actively developing a Dockerfile, layer caching can skip the first four steps of a six-step sequence if you have separate RUN commands, but not if they’re all squished together
  2. If you have very large (gigabyte-sized) layers, they can become unwieldy in a couple of ways; replacing a single giant COPY layer with several layers of merely a few hundred megabytes each can make docker push more reliable
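As a sketch of the caching point, assuming a hypothetical Python application with a requirements.txt: keeping the dependency install in its own step means edits to the application source reuse the cached install layer, while a single combined step would redo the install on every change.

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Copy only the dependency list first, so this layer's cache
# survives edits to the application source below.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changing anything under ./src invalidates the cache only from
# this COPY onward; the pip install layer above is reused.
COPY src/ ./src/

CMD ["python", "src/main.py"]
```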

I’d imagine the actual performance impact on disk I/O of the running container is negligible and have never thought to look.

dmaze (David Maze, https://forums.docker.com/users/dmaze) wrote on June 21:

> I think there are two practical reasons to want fewer layers:
>
>   1. If you have a step that downloads things then tries to clean up
>     after itself, those must be in the same RUN step (a standalone
>     RUN rm … layer is useless)
>   2. There’s a limit of (IIRC) 127 layers, and if your setup is
>     especially complicated you could bump into this
I could use some clarification of the first point.

Concerning the limit, I believe it’s 42 (someone with a sense of humor
there?), so it’s even worse than that.
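If you want to check how close an image is to whichever limit applies, `docker image inspect` can report the layer count directly (this assumes a local Docker daemon and an image that has already been pulled; nginx:latest is just an example):

```shell
# Count the filesystem layers in an image
docker image inspect --format '{{len .RootFS.Layers}}' nginx:latest
```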



The overall advantages of layering are pretty clear.

Let’s say your Dockerfile says

RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

Each RUN command makes a layer. The first two download things. Even though the last two layers delete the things the first two downloaded, the first two layers are still part of the final image, including all of the downloaded content. If instead you say

RUN apt-get update \
 && apt-get install -y nginx \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

then all of this happens in a single layer, and the intermediate package lists and .deb packages aren’t in the final image.
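To answer the original question empirically rather than by guesswork, you can build both variants and compare them: `docker history` shows the size each layer contributes, which makes the retained download layers visible. A sketch, assuming a local Docker daemon; the tags and Dockerfile names here are hypothetical:

```shell
# Build both variants under different tags (Dockerfile.split and
# Dockerfile.combined are placeholder names for the two versions above)
docker build -t nginx-split -f Dockerfile.split .
docker build -t nginx-combined -f Dockerfile.combined .

# Per-layer sizes; the split variant keeps the apt download
# layers in the final image even though later layers delete the files
docker history nginx-split
docker history nginx-combined

# Total image sizes side by side
docker images | grep 'nginx-'
```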