ARG changes unnecessarily busting cache?

In most programming languages, it is well-established best practice to declare constants at the top of the file so they are clear to future readers of the code. Unfortunately, with Docker, I’m torn between the best practice for readability (declare the ARGs at the top of the Dockerfile) and the best practice for build performance (bust the cache as late as possible by putting each ARG just before it is first referenced).

I understand that ENV variables can’t behave this way, because they change the environment and applications might be influenced by them even if they are never referenced in the Dockerfile. For ARGs, however, my own naive expectation would be that they only bust the cache when they are first referenced.

Example Dockerfile:

FROM python:3.7

# Set or override version filters, used within CI/CD to create custom image
# tags with explicit version references, e.g. '==1.0', '>=1.4,<2.0', etc.
# Leave as blank string to use latest available versions
ARG dbt_version_filter=''
ARG meltano_version_filter=''

# Do a bunch of needed dependency installs:
RUN apt-get update && apt-get install -y -q \
    build-essential \
    git \
    g++ \
    libsasl2-2 \
    libsasl2-dev \
    libsasl2-modules-gssapi-mit \
    libpq-dev \
    python-dev \
    python3-dev \
    python3-pip \
    python3-venv

# Install the first app:
ENV MELTANOENV=/venv/meltano
ENV MELTANO=/venv/meltano/bin/meltano
RUN python -m venv $MELTANOENV && \
    $MELTANOENV/bin/pip3 install "meltano$meltano_version_filter" && \
    $MELTANO --version

# Install the second app:
ENV DBTENV=/venv/dbt
ENV DBT=/venv/dbt/bin/dbt
RUN python3 -m venv $DBTENV && \
    $DBTENV/bin/pip3 install "dbt$dbt_version_filter" && \
    $DBT --version
# ...

In the example above, the current behavior seems to be that my entire cache is busted any time I modify the ARGs or their defaults. (Please correct me if this is not the expected behavior.)

The desired behavior is that modifying the ARGs or their defaults (either in the Dockerfile or in args to docker build) would only bust the cache upon their first reference. So, using the example above, I could modify the version for the second app’s install without busting the cache of the apt-get dependencies and the first app install. And likewise, I could modify the first app’s version without busting the cache of the dependency installs.
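For comparison, the performance-friendly layout I mentioned at the top (each ARG declared just before its first reference) would look roughly like the sketch below. It reuses the names from my example and shortens the apt-get package list for brevity. As I understand it, changing dbt_version_filter here should only rebuild the last RUN, while changing meltano_version_filter would still rebuild both app installs, since the second install comes after it.

```dockerfile
FROM python:3.7

# Dependency installs come first. No ARG is in scope yet, so changing a
# version filter should not invalidate this layer.
# (Package list abbreviated for this sketch.)
RUN apt-get update && apt-get install -y -q \
    build-essential \
    git \
    python3-venv

# Declare each ARG immediately before the RUN that references it.
ARG meltano_version_filter=''
ENV MELTANOENV=/venv/meltano
ENV MELTANO=/venv/meltano/bin/meltano
RUN python -m venv $MELTANOENV && \
    $MELTANOENV/bin/pip3 install "meltano$meltano_version_filter" && \
    $MELTANO --version

# Second app: its ARG is declared last, so changing it should only
# affect the instructions from here on.
ARG dbt_version_filter=''
ENV DBTENV=/venv/dbt
ENV DBT=/venv/dbt/bin/dbt
RUN python3 -m venv $DBTENV && \
    $DBTENV/bin/pip3 install "dbt$dbt_version_filter" && \
    $DBT --version
```

This works, but it scatters the configurable values throughout the file, which is exactly the readability cost I was hoping to avoid.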

Is there any hope that this could be implemented? Is there perhaps a way to get this behavior with current Docker? Or is it simply not feasible because of how the underlying platform works?

Thanks much.
