Dockerfile RUN chmod does not work

I’m working on a project for a home webservice and I am setting up a bind9 container using docker-compose.
The issue I have encountered comes from the Dockerfile directives. I need the /etc/bind directory in the container to be group-owned by the ‘bind’ group and to have group write permission; otherwise bind throws an error saying it cannot write and the process exits (not good).

To accomplish this I use the
RUN chown root:bind /etc/bind ; chmod g+rwx /etc/bind
directive on my Windows development machine. Ignore that I’m currently giving more than just write permissions.
Results for directory /etc/bind:
drwxrwsr-x 2 root bind 4096 Sep 18 10:08 bind
It works fine on my Windows 10 development machine. Cool. Let’s test it on my Ubuntu 20 testing machine.
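
For reference, the effect the chmod half of that RUN line is expected to have can be checked outside Docker. This is only a minimal sketch, assuming a POSIX shell and a throwaway directory from mktemp standing in for /etc/bind; the mode_of helper is a made-up name, not part of Docker or bind.

```shell
#!/bin/sh
# Sketch only: mimic the chmod half of the RUN line on a scratch directory.
set -eu

# mode_of DIR: print the ten-character mode string that `ls -ld` shows.
# (Hypothetical helper, introduced for this example.)
mode_of() {
    ls -ld "$1" | cut -c1-10
}

scratch=$(mktemp -d)        # stand-in for /etc/bind (assumption)
chmod 755 "$scratch"        # start without group write permission
before=$(mode_of "$scratch")
chmod g+rwx "$scratch"      # same mode change as in the Dockerfile
after=$(mode_of "$scratch")

echo "before: $before"
echo "after:  $after"
rm -rf "$scratch"
```

If the sixth character of the "after" string is w, the group can write, which is what the Windows listing above shows for /etc/bind.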

I move the files over through my gitlab.
I make sure the previous test is gone:
$ docker-compose -f ns1.docker-compose.yml down
$ docker-compose -f ns1.docker-compose.yml ps # Just to be sure

I make sure all previous volumes are gone:
$ docker volume prune
$ docker volume ls # Just to be sure

Build those containers
$ docker-compose -f ns1.docker-compose.yml build

Hop into the container to check the permissions of the directory
$ docker-compose -f ns1.docker-compose.yml run bind9 /bin/bash
The permissions look as if the chmod command never ran. I don’t know why it will not run the command.
results for /etc/bind on Ubuntu 20 testing machine:
drwxr-sr-x 2 root bind 4096 Sep 18 21:44 bind

Note: The group owner is bind group indicating the first half of the RUN command worked but the second half which changes the group permissions does not work.
(EDIT: It’s also worth noting that I tried multiple combinations of the RUN directive within my Dockerfile:
RUN command1 && command2
RUN command1 ; command2
RUN command1
RUN command2
None have worked for command 2, not even splitting them into separate RUN directives.)
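
The two listings differ only in the group triad of the mode string (characters 5 to 7), where a lowercase s means setgid plus execute. A small sketch for comparing such strings, assuming a POSIX shell; group_bits and group_can_write are hypothetical helper names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: compare the group permission bits of two `ls -l` mode strings.
set -eu

# group_bits MODE: print characters 5-7 (the group triad) of a mode string.
group_bits() {
    printf '%s\n' "$1" | cut -c5-7
}

# group_can_write MODE: succeed if the middle character of the triad is 'w'.
group_can_write() {
    case "$(group_bits "$1")" in
        ?w?) return 0 ;;
        *)   return 1 ;;
    esac
}

windows_mode="drwxrwsr-x"   # /etc/bind as listed after the Windows build
ubuntu_mode="drwxr-sr-x"    # /etc/bind as listed after the Ubuntu build

for mode in "$windows_mode" "$ubuntu_mode"; do
    if group_can_write "$mode"; then
        echo "$mode: group '$(group_bits "$mode")' can write"
    else
        echo "$mode: group '$(group_bits "$mode")' cannot write"
    fi
done
```

The setgid bit is present in both listings, so only the write bit that chmod g+rwx should have added is missing on the Ubuntu side.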

Issue: The same Dockerfile produces different results on different machines with the same version of Docker.
The Windows machine uses Docker Desktop, so there I must use ‘docker compose’ rather than the Linux command ‘docker-compose’.

My development machine is running Windows 10 Pro Version 20H2 OS build 19042.1237
Docker version 20.10.8, build 3967b7d

On the Ubuntu testing machine, $ uname -a gives:
5.11.0-34-generic #36~20.04.1-Ubuntu x86_64
Docker version 20.10.8, build 3967b7d
docker-compose version 1.29.2, build 5becea4c

I’ve cut out all unnecessary files to recreate the issue and put the necessary files into their own public github repo for easy testing:

commands I use to recreate the issue on my Ubuntu 20 machine after cloning and cd’ing into repo:

make sure all previous volumes with the same name referenced in this Dockerfile are gone if you’ve run this before:

$ docker volume prune
$ docker volume ls

build from compose

$ docker-compose -f ns1.docker-compose.yml build

Run a bash to get into the container

$ docker-compose -f ns1.docker-compose.yml run bind9 /bin/bash

cd into the /etc directory, run ls -l, and find the permissions of the bind directory:

$ cd /etc
$ ls -l

Your help in resolving this issue is much appreciated.
I think one workaround for this since the issue is from a persistent volume is to run a bash in the container with ‘-u 0’ to be root and change permissions by hand on the first run. The permission should persist and bind9 service should run fine afterwards.

Could this be a difference between (implicitly) using BuildKit on only one of the platforms? What if you use docker-compose with a dash on Windows too? Or: just ensure your images have been built before using Compose. Not really my cup of tea though, except for being aware that it may not be the exact same versions.

On an Intel Mac, the official BIND 9 image without any RUN chown shows:

docker run --rm store/internetsystemsconsortium/bind9:9.11 ls -la /etc/bind
total 56
drwxr-sr-x 2 root bind 4096 Sep 19 10:27 .
drwxr-xr-x 1 root root 4096 Sep 19 10:27 ..
-rw-r--r-- 1 root root 1859 Aug 20  2020 bind.keys
-rw-r--r-- 1 root root  237 Aug 20  2020 db.0
-rw-r--r-- 1 root root  271 Aug 20  2020 db.127
-rw-r--r-- 1 root root  237 Aug 20  2020 db.255
-rw-r--r-- 1 root root  353 Aug 20  2020 db.empty
-rw-r--r-- 1 root root  270 Aug 20  2020 db.local
-rw-r--r-- 1 root bind  463 Aug 20  2020 named.conf
-rw-r--r-- 1 root bind  498 Aug 20  2020 named.conf.default-zones
-rw-r--r-- 1 root bind  165 Aug 20  2020 named.conf.local
-rw-r--r-- 1 root bind  846 Aug 20  2020 named.conf.options
-rw-r----- 1 bind bind   77 Sep  3  2020 rndc.key
-rw-r--r-- 1 root root 1317 Aug 20  2020 zones.rfc1918

Note that on the first line, for an Intel Mac, /etc/bind is already using group bind in the base image.

What if you run the above command on your Ubuntu environment?

Comparing that to using:

FROM store/internetsystemsconsortium/bind9:9.11

RUN chown root:bind /etc/bind ; chmod g+rwx /etc/bind

Then, after docker build -t my-bind9 . on a Mac, it seems the chown root:bind /etc/bind was not needed (unless you intended to recurse into the folder itself, in which case your chown command is wrong?). And the chmod g+rwx /etc/bind indeed worked, like on your Windows machine:

docker run --rm my-bind9 ls -la /etc/bind
total 56
drwxrwsr-x 2 root bind 4096 Sep 19 10:26 .
drwxr-xr-x 1 root root 4096 Sep 19 10:26 ..
-rw-r--r-- 1 root root 1859 Aug 20  2020 bind.keys
-rw-r--r-- 1 root root  237 Aug 20  2020 db.0
-rw-r--r-- 1 root root  271 Aug 20  2020 db.127
-rw-r--r-- 1 root root  237 Aug 20  2020 db.255
-rw-r--r-- 1 root root  353 Aug 20  2020 db.empty
-rw-r--r-- 1 root root  270 Aug 20  2020 db.local
-rw-r--r-- 1 root bind  463 Aug 20  2020 named.conf
-rw-r--r-- 1 root bind  498 Aug 20  2020 named.conf.default-zones
-rw-r--r-- 1 root bind  165 Aug 20  2020 named.conf.local
-rw-r--r-- 1 root bind  846 Aug 20  2020 named.conf.options
-rw-r----- 1 bind bind   77 Sep  3  2020 rndc.key
-rw-r--r-- 1 root root 1317 Aug 20  2020 zones.rfc1918

So, what if you run the above on your Ubuntu machine?

If you want people to test then please show us a reproducible example for a Dockerfile and docker-compose.yml using only a public image. Oh, I missed you created a GitHub repo. I did not look into that, as you may have guessed.

Ah, your GitHub repo shows VOLUME /etc/bind. That may be the culprit? I’ve not investigated, and I see you’ve been pruning the volumes for your tests, but still maybe reading about what happens for empty volumes helps: Populate a volume using a container.

Hey, thank you for getting back to me, and I apologize for waiting several days; I’ve had a busier week than normal.
Alright, code.

For some reason I had it in my head that I could not use docker-compose on a Windows machine and that it had given me errors, but checking now, it works fine. Huh. I guess I can’t even trust myself.
From reading here: docker: 'compose' is not a docker command when installing using convenience scripts · Issue #8630 · docker/compose (github.com)
I would think there is no difference between using ‘docker-compose’ command vs ‘docker compose’ command.
Even so I will test the difference.

Test 1: Using the Windows machine and Docker Desktop I’m gonna run one test using docker-compose and another using docker compose to see if that is where the output difference is coming from.
Using the same Dockerfile provided in the demonstration GitHub repo:

FROM internetsystemsconsortium/bind9:9.16

VOLUME /etc/bind

COPY  ./master/ns1.named.conf   /etc/bind/named.conf
COPY  db.example.internal     /etc/bind/db.example.internal

RUN chown root:bind /etc/bind ; chmod g+rwx /etc/bind
RUN chown root:bind /etc/bind/named.conf ; chown root:bind /etc/bind/db.example.internal

#USER bind

Run 1 Commands:

docker volume prune
docker-compose -f .\ns1.docker-compose.yml build   #Result: Success
docker-compose -f .\ns1.docker-compose.yml up      #Result: Success
docker-compose -f .\ns1.docker-compose.yml down  #Result: Success

Run 2 Commands:

docker volume prune   # Just to be sure
docker compose -f .\ns1.docker-compose.yml down   # Just to be sure
docker compose -f .\ns1.docker-compose.yml build   # Success
docker compose -f .\ns1.docker-compose.yml up   # Success
docker compose -f .\ns1.docker-compose.yml down   # Success
docker volume prune   # Cleanup

Results: On the Windows system both commands produced the same results.
Note: Docker Desktop which I use on Windows did get an update this week to move to v2.0.0-rc.3

Test 2: I think it’s interesting that the chown command is not needed on a Mac. I tested the same on my Windows and Ubuntu machines. The result on both the Linux and the Windows machine is that /etc/bind is already group-owned by the bind group in the base image. That’s good news.

Test 3: No longer using docker-compose but using just docker and using the following Dockerfile. Both systems are running Docker version 20.10.8, build 3967b7d

FROM internetsystemsconsortium/bind9:9.16

RUN chmod g+rwx /etc/bind

Windows:

docker image build -t mybind:latest .
# Builds fine
 => [internal] load build definition from Dockerfile                                                               0.0s
 => => transferring dockerfile: 31B                                                                                0.0s
 => [internal] load .dockerignore                                                                                  0.0s
 => => transferring context: 2B                                                                                    0.0s
 => [internal] load metadata for docker.io/internetsystemsconsortium/bind9:9.16                                    1.9s
 => [1/2] FROM docker.io/internetsystemsconsortium/bind9:9.16@sha256:741c12d794f1af570898d37288635366ead7d9a1ee4a  0.0s
 => CACHED [2/2] RUN chmod g+rwx /etc/bind                                                                         0.0s
 => exporting to image                                                                                             0.0s
 => => exporting layers                                                                                            0.0s
 => => writing image sha256:9cf71c8cd1ff3bab424702906965b1d597773d447820bb915fd8e19a44be44b8                       0.0s
 => => naming to docker.io/library/mybind:latest

docker run mybind:latest ls -la /etc/bind
# Results: Looks good.
total 56
drwxrwsr-x 2 root bind 4096 Sep 26 01:59 .
drwxr-xr-x 1 root root 4096 Sep 26 01:59 ..
-rw-r--r-- 1 root root 1991 Sep 16 07:55 bind.keys
-rw-r--r-- 1 root root  237 Sep 16 07:54 db.0
-rw-r--r-- 1 root root  271 Sep 16 07:54 db.127
-rw-r--r-- 1 root root  237 Sep 16 07:54 db.255
-rw-r--r-- 1 root root  353 Sep 16 07:54 db.empty
-rw-r--r-- 1 root root  270 Sep 16 07:54 db.local
-rw-r--r-- 1 root bind  463 Sep 16 07:54 named.conf
-rw-r--r-- 1 root bind  498 Sep 16 07:54 named.conf.default-zones
-rw-r--r-- 1 root bind  165 Sep 16 07:54 named.conf.local
-rw-r--r-- 1 root bind  846 Sep 16 07:54 named.conf.options
-rw-r----- 1 bind bind  100 Sep 21 19:30 rndc.key
-rw-r--r-- 1 root root 1317 Sep 16 07:54 zones.rfc1918

Linux:

docker image build -t mybind:latest .
# Built fine
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM internetsystemsconsortium/bind9:9.16
 ---> 225a67715eb2
Step 2/2 : RUN chmod g+rwx /etc/bind
 ---> Running in 70f313e8dbaa
Removing intermediate container 70f313e8dbaa
 ---> 7bc872eeee26
Successfully built 7bc872eeee26
Successfully tagged mybind:latest

docker run mybind:latest ls -la /etc/bind
# Ahhhh there it is.
total 56
drwxr-sr-x 2 root bind 4096 Sep 26 02:09 .
drwxr-xr-x 1 root root 4096 Sep 26 02:09 ..
-rw-r--r-- 1 root root 1991 Aug 20 12:41 bind.keys
-rw-r--r-- 1 root root  237 Aug 20 12:40 db.0
-rw-r--r-- 1 root root  271 Aug 20 12:40 db.127
-rw-r--r-- 1 root root  237 Aug 20 12:40 db.255
-rw-r--r-- 1 root root  353 Aug 20 12:40 db.empty
-rw-r--r-- 1 root root  270 Aug 20 12:40 db.local
-rw-r--r-- 1 root bind  463 Aug 20 12:40 named.conf
-rw-r--r-- 1 root bind  498 Aug 20 12:40 named.conf.default-zones
-rw-r--r-- 1 root bind  165 Aug 20 12:40 named.conf.local
-rw-r--r-- 1 root bind  846 Aug 20 12:40 named.conf.options
-rw-r----- 1 bind bind  100 Aug 25 14:43 rndc.key
-rw-r--r-- 1 root root 1317 Aug 20 12:40 zones.rfc1918

So that’s an interesting find. The problem is not in Docker Compose; it shows up with plain Docker. Both systems are running the same version and the same build, yet one produces one result and the other produces a different result.
I’m thinking I should post this as an issue in the Docker github.
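
One detail in the two build logs above may be worth checking before filing an issue: their line shapes differ. Lines starting with " => " are what BuildKit prints, while "Step 1/2 :" lines come from the classic builder, which would fit the earlier BuildKit question. Whether that explains the permission difference is not established in this thread. The sketch below only classifies log lines; it assumes a POSIX shell with grep, and builder_of is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: guess which builder produced a build log from its line shapes.
set -eu

# builder_of LOGTEXT: print "buildkit", "classic", or "unknown".
# (Hypothetical helper; the patterns are based on the two logs in this thread.)
builder_of() {
    if printf '%s\n' "$1" | grep -q '^ *=> '; then
        echo "buildkit"
    elif printf '%s\n' "$1" | grep -q '^Step [0-9][0-9]*/[0-9][0-9]* : '; then
        echo "classic"
    else
        echo "unknown"
    fi
}

# One line copied from each build log in this thread.
windows_line=' => CACHED [2/2] RUN chmod g+rwx /etc/bind'
linux_line='Step 2/2 : RUN chmod g+rwx /etc/bind'

echo "Windows log looks like: $(builder_of "$windows_line")"
echo "Linux log looks like:   $(builder_of "$linux_line")"
```

Setting the DOCKER_BUILDKIT environment variable to 1 or 0 before docker image build selects the builder in Docker 20.10, so rebuilding on the Ubuntu machine both ways would be one way to test this guess.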