Very big Dockerfile: how to simplify its maintenance?

Hello

I’m writing a very big Dockerfile (more than 1,200 lines, with a lot of comments ;)), with a lot of stages and arguments.

In some parts I’m forced to copy/paste blocks of code, and that’s not good at all. So I’m thinking of something like the templates (anchors and aliases) we have in YAML:

Defaults: &defaults
  Company: foo
  Item: 123

Computer:
  <<: *defaults
  Price: 3000

Is there something like that for Dockerfiles?

An additional question: how do you manage very big Dockerfiles? Do you have any tips? Thanks

In-depth

I’m trying to optimize my Dockerfile by using a lot of stages to improve cache reuse.
My projects are mainly PHP + JS. I have a stage for Composer, a stage for Yarn and a lot of other stages.

In my composer stage, I copy the composer.* (json and lock) files into the composer image, then do a few things like setting the proxy, forcing git@ instead of https, … and finally run `composer update`.

In my yarn stage, I copy the package.* (json and lock) files into the node image, then do a few things (uh oh, the same things in fact), and finally run yarn install.

In my PHP stages (a lot of stages), I have to do a few things like setting the proxy once again, configuring git once again, and once again doing things already done in other stages.

So the question: how can I reuse some parts of my Dockerfile in several places?

Why more than 1,200 lines

Because I use that Dockerfile for all my projects. I have build arguments to choose whether or not to install PHP Redis, PHP GD, SOAP, XDebug, OPCache, Chrome (for test automation) and many more.

You can only “play” with the stages and use the same parent stage in multiple stages, or write a script and run it with parameters. You don’t have to put everything directly into the Dockerfile. An additional script can be mounted, or temporarily added and removed in the same stage, or, since it would be a couple of kilobytes at most, kept in a base stage.
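To illustrate the "same parent stage" idea, here is a minimal sketch. The stage names, the PHP version and the shared git step are assumptions chosen for the example, not taken from the original Dockerfile.

```dockerfile
# Hypothetical common parent: steps shared by the PHP stages are written once.
FROM php:8.3-cli AS php-common
# Example shared step (assumption): force git@ (SSH) instead of https for GitHub.
RUN apt-get update \
 && apt-get install -y --no-install-recommends git \
 && rm -rf /var/lib/apt/lists/* \
 && git config --system url."git@github.com:".insteadOf "https://github.com/"

# Both child stages start from the layers above and reuse their cache.
FROM php-common AS php-dev
RUN pecl install xdebug && docker-php-ext-enable xdebug

FROM php-common AS php-prod
RUN docker-php-ext-install opcache
```

This only helps for stages that can share a base image. Stages built on different images (composer, node, php) cannot inherit from one parent, which is where a shared script becomes useful.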

You would need to be careful though, since using a single script for all stages would invalidate the cache of every stage that uses it as soon as a single character changes in that file. So you could have multiple scripts: one for common functions and one for each stage if needed.
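A possible way to combine mounted scripts with per-stage scripts is sketched below. It assumes BuildKit (for RUN --mount) and a hypothetical scripts/ directory in the build context. The script names and image tags are illustrative, and the lock file name for Yarn is assumed to be yarn.lock.

```dockerfile
# syntax=docker/dockerfile:1

# Hypothetical layout in the build context (names are assumptions):
#   scripts/common.sh    shared functions (proxy, git configuration, ...)
#   scripts/composer.sh  steps needed only by the composer stage
#   scripts/node.sh      steps needed only by the node stage

FROM composer:2 AS composer
WORKDIR /app
COPY composer.json composer.lock ./
# Bind-mount only the scripts this stage needs; they are not stored in the image.
RUN --mount=type=bind,source=scripts/common.sh,target=/build/common.sh \
    --mount=type=bind,source=scripts/composer.sh,target=/build/composer.sh \
    sh /build/composer.sh

FROM node:20 AS node
WORKDIR /app
COPY package.json yarn.lock ./
RUN --mount=type=bind,source=scripts/common.sh,target=/build/common.sh \
    --mount=type=bind,source=scripts/node.sh,target=/build/node.sh \
    sh /build/node.sh
```

With this split, a change to node.sh should only affect the node stage, while a change to common.sh still affects every stage that mounts it, so it is worth keeping that file small and stable.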

You could also consider using a template system to generate separate Dockerfiles, as in this repository, where each “orphan branch” is the output of the scripts run from the main branch: GitHub - rimelek/docker-php: PHP images based on the official versions with additional helper scripts

I used envsubst and shell scripts, but you could use any template system you like.
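As a rough idea of what an envsubst-based template could look like (this is not the layout of the linked repository; the file name and the TPL_* variable names are assumptions):

```dockerfile
# Dockerfile.tpl (hypothetical template; the TPL_* names are assumptions).
# Render one Dockerfile per variant. Passing an explicit variable list to
# envsubst leaves ordinary Dockerfile variables such as ${APP_ENV} untouched:
#
#   TPL_PHP_VERSION=8.3 TPL_VARIANT=cli envsubst '${TPL_PHP_VERSION} ${TPL_VARIANT}' < Dockerfile.tpl > Dockerfile

FROM php:${TPL_PHP_VERSION}-${TPL_VARIANT}

# Not in the envsubst list, so this stays a regular build argument.
ARG APP_ENV=prod
ENV APP_ENV=${APP_ENV}
```

Restricting the substitution to a named list matters because Dockerfiles already use the ${VAR} syntax for ARG and ENV. Without the list, envsubst would replace those too, with empty strings when the variables are not set in the shell.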

If you feel you have too many arguments in the Dockerfile, you can use a single ARG containing JSON, or just a list of items separated by line breaks or commas, as I did in the generated output.
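A minimal sketch of the single-ARG idea with a comma-separated list follows. The argument name and the listed extensions are assumptions for illustration. Only extensions that need no extra system libraries are used, since others (GD, or PECL extensions such as Redis) would need additional packages or a different install command.

```dockerfile
FROM php:8.3-cli

# One argument instead of one boolean argument per extension.
# The default list is an illustrative assumption.
ARG PHP_EXTENSIONS="bcmath,pdo_mysql"

# Split the comma-separated list and install each item in turn.
RUN set -eu; \
    for ext in $(echo "$PHP_EXTENSIONS" | tr ',' ' '); do \
        docker-php-ext-install "$ext"; \
    done
```

The list can then be overridden per project, for example with docker build --build-arg PHP_EXTENSIONS="bcmath,opcache" .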

You could also create a base image with a separate Dockerfile to have fewer stages in your bigger Dockerfile. It is not uncommon.
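For example, the main Dockerfile could start from a base image built separately. The image tag and the base.Dockerfile file name below are hypothetical.

```dockerfile
# Main Dockerfile. The base image is assumed to be built beforehand from its
# own file, for example:
#   docker build -f base.Dockerfile -t my-org/php-base:8.3 .
# (file name and tag are hypothetical)
ARG BASE_IMAGE=my-org/php-base:8.3

FROM ${BASE_IMAGE} AS app
WORKDIR /app
COPY . .
```

The trade-off is that the base image has to be rebuilt and published (or kept in the local image store) on its own, so changes to it are no longer picked up automatically by a single docker build.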

Thank you @rimelek

I’m not sure I’m any further along… Sorry!

I’m afraid that reusing blocks of code, like YAML templates, doesn’t exist yet, and that there is no other way than to think long and hard about how to split the single Dockerfile into multiple stages, which is what I’m doing right now.

I already have over 20 stages in my long Dockerfile and, as I said, part of it is duplicated.

I have a stage based on the composer image, one on node and one on php. All standard images. In each image I have to do the same things, like setting the proxy.

In my composer image, if I can, I download the dependencies (i.e. I run composer install) so that, later on in my Dockerfile, I can copy the dependencies into my PHP image and reuse the cache, and that’s great.

But in some cases I can’t do this, because downloading dependencies requires a specific file (composer.json for PHP) and that file can be missing in that stage. So then, in my PHP image, if my dependencies are not yet present, I have to run composer install there and … yeah, I have the same block of code again.

It’s the same story for node (the package.json can be missing in the earlier stage).

I’m trying to avoid this situation as much as possible, and I think that’s why I have so many stages, but in some situations, like having multiple base images, I can’t find a better way.

Once again, thanks for your answer and your time.