Adding volumes to mediawiki image causes its Apache server to produce 403 errors

I’ve been playing around with this mediawiki image:

docker pull mediawiki:1.41.0

Here is my docker-compose.yml:

version: '3.6'
networks:
  my_network:
    external: true
services:
  mediawiki:
    image: mediawiki:1.41.0
    container_name: mediawiki
    restart: always
    networks:
      - my_network
    volumes:
      - './mediawiki/html/:/var/www/html/:rw'
      - './mediawiki/data/:/var/www/data/:rw'

Without the volumes configuration everything works as expected. I’m able to go through the initial mediawiki setup process and generate the LocalSettings.php file and move it into the container with docker cp LocalSettings.php mediawiki:/var/www/html. When I run docker compose restart the SQLite DB is created, all my settings are applied, I can browse to the web interface, make new pages, etc.

With the volumes configuration in the compose file, the /var/www/html directory in the container never gets populated if I run docker compose up --force-recreate -d. If I remove the volumes configuration, go through the setup process described above, then add the volumes configuration back in and run docker compose up -d, the contents of the container’s /var/www/html directory are blown away! This causes the container’s Apache server to issue a 403 Forbidden error when I try to access the web interface.

I’ve never had permission problems with volumes between host/container, but just to be sure it wasn’t a permission problem I gave the directories on the host 777 permissions and even tried changing the owner and group to www-data, and confirmed that the www-data user/group had the same ID between the host/container.

I’m kind of at a loss now. What is causing this?

Update 1
I had a fundamental misunderstanding of how volumes work until I read this. Now I realize that the shorthand volume blocks in my config above were actually doing bind mounts. It was easy for me to believe otherwise because until now all the volumes I’ve dealt with were bound to container directories that did not contain files that were actually part of the image, so I’d never seen this behavior. Though I do have this working now, I still have not been able to solve my problem the way I want. My updated docker-compose.yml:

version: '3.6'
networks:
  my_network:
    external: true
volumes:
  html:
  data:
services:
  mediawiki:
    image: mediawiki:1.41.0
    container_name: mediawiki
    restart: always
    networks:
      - my_network
    volumes:
      - type: volume
        source: html
        target: /var/www/html
      - type: volume
        source: data
        target: /var/www/data

After running docker compose down -v and docker compose up --force-recreate -d, the MediaWiki instance is working, but the whole point of configuring volumes was to get access to the data in the container for backup purposes. I technically do have access to it, but it is way off in /var/lib/docker/volumes/mediawiki_html/_data and /var/lib/docker/volumes/mediawiki_data/_data. I guess I can symlink to those directories, but it feels like there should be a cleaner way to get at container data and keep it next to its docker-compose.yml file.

The error log may have more details on precisely why you are getting a 403.

I figured out the source of the problem and have a solution, but it isn’t a solution I like. I’ve updated my OP.

I am no MediaWiki expert, but according to the image description, these are the container paths intended to be used as volumes:

  • /var/www/html/images
  • /var/www/html/LocalSettings.php

Furthermore, you need to create a database dump for a consistent backup.

You introduced challenges that would not exist if you had just mapped the container paths that are actually supposed to be mapped as volumes. Instead, you mapped container paths that hold all, or at least parts, of the application. The workaround suffers from the problem that the “copy back” is a one-time thing that only happens when the volume is empty.

I am afraid there is no way around learning how an image is actually supposed to be used… though normally you can expect to find it properly documented in the Docker Hub description.

You introduced challenges that would not exist if you had just mapped the container paths that are actually supposed to be mapped as volumes. Instead, you mapped container paths that hold all, or at least parts, of the application. The workaround suffers from the problem that the “copy back” is a one-time thing that only happens when the volume is empty.

Yes, I saw the recommendation. I’ve read several times that if you have a choice between mapping directories and mapping files, you go with directories. Volume-mapped directories can be shared between containers; volume-mapped files can’t. Files updated in a volume-mapped directory are updated live while a container is running; volume-mapped files aren’t. Volume-mapped directories are easier to configure for multiple files in the same directory (i.e. one line instead of one line per file), and they avoid potential mount limits imposed by the image. All that aside, I wanted access to the other files in that directory, which is why I mapped it.

Furthermore, you need to create a database dump for a consistent backup.

You don’t need to create a dump for an SQLite DB, that is one of, if not the biggest benefit of an SQLite DB. It is all in one file and can be easily moved around.
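That said, copying the file while the server is writing to it can catch it mid-transaction; the sqlite3 CLI’s .backup command takes a consistent snapshot even of a live database. A minimal sketch with a throwaway database (the real MediaWiki DB path is an assumption and will differ on your system):

```shell
# Demo: .backup produces a consistent copy of a (possibly live) SQLite DB.
cd "$(mktemp -d)"
sqlite3 demo.sqlite "CREATE TABLE t(x); INSERT INTO t VALUES (1);"
sqlite3 demo.sqlite ".backup demo-backup.sqlite"
sqlite3 demo-backup.sqlite "SELECT count(*) FROM t;"   # prints 1
```

For the wiki, this would target something like ./mediawiki/data/my_wiki.sqlite once the data directory is reachable from the host.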

I am afraid there is no way around learning how an image is actually supposed to be used… Though, normally you can expect to find it properly document in the Docker Hub description.

Counterpoint: “supposed to” implies intent, and the intent could be wrong. Said more plainly, just because an image was created a certain way doesn’t mean that is the way it should have been created, so I’m looking for solutions to a problem that could have been solved when the image was created. The creators of the image may have intended for me to only make volumes of the images directory and the LocalSettings.php file mentioned in the docs, but they could instead have put the files in /var/www/html that the image relies on into some other protected directory in the image and copied them into /var/www/html when the container starts. Everybody wins.

In the blogpost you linked in your original question I wrote about custom source paths for bind mounts.
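For reference, a sketch of that approach in compose: a named volume backed by a custom host path through the local driver’s bind options. The paths here are assumptions, the host directory must exist before the first docker compose up, and the device path must resolve to an absolute path (hence ${PWD}):

```yaml
volumes:
  html:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${PWD}/mediawiki/html
```

Because html is still a named volume, the image’s /var/www/html content should be copied into it on first use, while the files end up next to the compose file instead of under /var/lib/docker/volumes.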

Or there could be a way to reach your actual goal while following the recommendations. I know the feeling of “how am I supposed to use volumes with this badly designed folder structure”, but in my case I wanted to avoid mounting the root dir. I ended up mounting the root dir anyway and also mounting the subdirectories over the first mount. Whenever I had to update the software, I had to rename the volume of the root directory so it would be empty again.

In another case, I could create my own image based on a base image (I think it was dokuwiki or maybe just wordpress) and my root dir basically contained symbolic links to the actual data so I didn’t have to mount a volume to that anymore. The image had an entrypoint or command that handled the symbolic links before starting the app.

So an image can be created in a different way than you would prefer it, but then the solution should be changing the image and not starting the container with wrong parameters. Of course it all depends on what you want to achieve. If you purposely choose a way knowing the risks that’s fine. :slight_smile: But know that if you mount the root directory, you will not be able to update mediawiki easily anymore.

What do we mean by “volume mapped files”? A single file can’t be a volume, but a bind mounted file can be shared between containers.

Bind-mounted files can be updated from the host too, if that is what you meant. The problem could be that text editors often create a temporary file while editing, and when you save, they replace the original file with the temporary one. When a file is mounted into a container, the path is only used as a reference to find it; once it is mounted, the path doesn’t matter anymore. When you replace a mounted file, the container will still see the old file, because it only cares about the inode, which identifies the file physically on the disk. If you configure the text editor not to create temporary files, the problem goes away, but that is not recommended: if something happens while you edit the file and only part of the new content is saved, the file can become corrupted.

But… you can always copy the original file manually, edit the copy and redirect the content to the original file like this:

cat copy.php > original.php

So even if something happens during saving the content, you still have the copy to try again.
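A quick way to see why the redirect works where a rename does not (file names here are just for the demo):

```shell
cd "$(mktemp -d)"
echo 'old content' > original.php
before=$(ls -i original.php | awk '{print $1}')   # inode before
echo 'new content' > copy.php
cat copy.php > original.php                       # overwrite in place
after=$(ls -i original.php | awk '{print $1}')    # inode after
[ "$before" = "$after" ] && echo "inode unchanged"
mv copy.php original.php                          # a rename replaces the file
final=$(ls -i original.php | awk '{print $1}')
[ "$final" != "$after" ] && echo "inode changed by mv"
```

The container keeps following the original inode, so only the in-place overwrite is visible inside it.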

Another way could be creating a new image with environment variables which can be read in PHP (or any language); the config file would just read these variables, or the entrypoint script could generate the config file. It would mean that you need to recreate the container whenever you want to change the parameters, but the recreation happens quickly.
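A hypothetical sketch of such an entrypoint (all variable names are made up, not part of the official image; WIKI_DIR defaults to the current directory here just so the sketch is runnable, the real target would be /var/www/html):

```shell
#!/bin/sh
# Render a minimal LocalSettings.php from environment variables.
# WIKI_DIR, WIKI_SITENAME and WIKI_SERVER are invented names for this demo.
WIKI_DIR="${WIKI_DIR:-.}"
cat > "$WIKI_DIR/LocalSettings.php" <<EOF
<?php
\$wgSitename = "${WIKI_SITENAME:-MyWiki}";
\$wgServer = "${WIKI_SERVER:-http://localhost:8080}";
EOF
# ...then hand off to the web server, e.g. exec apache2-foreground
```

In a custom image you would COPY this script, set it as the ENTRYPOINT, and recreate the container whenever the variables change.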

So I agree with you that it is easier to work with mounted directories than mounted files, but both ways can work, and when it comes to config files, it is easy to handle them in a customized image. You can also mount a directory with that single config file in it and replace the original config file with a symbolic link pointing to the custom config in the mounted directory.

Yeah, thanks for that article. I need to read it through a couple more times to let it all sink in. I did read the section you mentioned, and it seemed to be exactly what I wanted, but I gave it a shot and wasn’t having much luck. Not sure what I was doing wrong, but I’ll probably give it another go since that seems to be the most correct way to accomplish what I want.

Typically when I’m mounting volumes it is for backup purposes. It is just easier to tarball a directory on the host machine than it is to issue a bunch of docker exec commands to get at data in the container, and I’d like to keep my container backup script customization to a minimum if possible :wink: If I can’t get custom volume paths to work then I think my only options are accessing the /var/lib/docker/volumes directory or using docker exec commands to get what I want.
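The host-side backup this enables can be as simple as a tar over the mounted directories. A self-contained sketch (using stand-in directories in place of the real mounts, which would sit next to docker-compose.yml):

```shell
cd "$(mktemp -d)"
mkdir -p mediawiki/html mediawiki/data            # stand-ins for the mounts
echo 'demo' > mediawiki/data/my_wiki.sqlite
tar czf "mediawiki-backup-$(date +%F).tar.gz" -C mediawiki html data
tar tzf mediawiki-backup-*.tar.gz | grep data/my_wiki.sqlite   # verify contents
```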

I’m using the word volume in context of a compose file; ignoring the binding and/or “actual a volume” operations taking place on the back end. Specifically I’m talking about the compose shorthand syntax.

I usually use nano for editing, and I don’t think it creates temp files. Larger files that are difficult to navigate in nano I edit in Notepad++ and send back with scp.

True.

Absolutely true! Every image is opinionated, and every image can be different for the same application and service. As @rimelek wrote, you can always create your own image and make it behave the way you want. Though mounting a bind or volume over a path that has required application parts in it will be a problem with each and every image.

Just because we don’t like the approach and would never use it like that, doesn’t mean you can’t use it like that.

Can you keep us posted in a couple of months about whether you stuck to your current approach, and share your experience so that other users in your situation can benefit from it?

It actually creates one on Ubuntu, with a .swp extension and the filename starting with a dot, but it seems it doesn’t change the inode, so that’s good. Using Notepad++ and scp would probably change it; when I had problems with volume mounts and inodes I think I was using VSCode or maybe a JetBrains IDE.


I’ve actually decided to go another direction. I did find a way to do what I was aiming to do in my OP before I made that decision though. It was a real hassle. The docs suggest creating volumes for the /var/www/html/images directory and the /var/www/html/LocalSettings.php file. For the images directory volume to work correctly, the host directory has to exist prior to starting the container for the first time, and it needs the correct ownership and permissions:

sudo mkdir -p images && sudo chown www-data:www-data images && sudo chmod 755 images

But creating a volume for the /var/www/html/LocalSettings.php file is much worse. It can’t exist prior to running the container for the first time; even if it is completely empty or already contains the exact configuration you want, it will cause the container to error. That means you have to comment out that portion of your docker-compose.yml file prior to starting the container the first time. Then you have to go through the initial MediaWiki setup process in the browser. You can’t skip this even if you have a LocalSettings.php file already. Once you are done with the initial setup process, you are presented with a link to download your LocalSettings.php file. Now you can copy it over to the place where your volume config says it should be, then uncomment that line in your docker-compose.yml file. The file also needs correct permissions to play nice with the container, so you have to run this command:

sudo chown www-data:www-data LocalSettings.php && sudo chmod 440 LocalSettings.php

Now the volumes that the MediaWiki docs suggest will work. Unfortunately, the /var/www/data directory, which contains the SQLite DB files, is for some reason not suggested as a volume. Since I wanted access to those files, I had to start the process over and repeat for /var/www/data everything I did for /var/www/html.

After I actually got the image up and running, I quickly started looking for other options. Trying to do some basic stuff like modifying the login page, searching for templates, etc. was not straightforward. The image just is not easy to manage, unfortunately.

The Docker Hub search put WikiJS down near the bottom of the results when the search term was “wiki”, and Xwiki didn’t even appear on the first page. My initial research makes it look like either of those will be a better solution for my situation.

Thank you for keeping us posted!

The clean approach for this would be to create a Dockerfile to build a custom image that uses the mediawiki image as base image, and copies your modified files into the custom image. Then build the image, push it to a registry, and finally deploy a container based on the customized image.

Practically, it could look like this:

  • you need a git repo to version control the files you need to create your own image
  • create/edit the files in this git repo in your local workstation
  • build the image on your local workstation
  • test the image locally by running a container
  • if tests are successful: commit and push the change back to git
  • push the image to a registry, either by pushing the local image or, if you use CI/CD, by re-building the image in a pipeline and pushing the result
  • on the target machine: deploy a container based on the new image from the registry

A container should be self-contained, as such its image should provide everything that is needed to run the application, except the persistent data.
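A minimal Dockerfile for such a customized image could look like this (the copied filenames are assumptions about which files you modified):

```dockerfile
FROM mediawiki:1.41.0
# Bake the customized files into the image instead of mounting them
COPY LocalSettings.php /var/www/html/LocalSettings.php
COPY templates/ /var/www/html/templates/
```

Build it with something like docker build -t my-mediawiki:1.41.0 . and reference that tag from the compose file in place of the stock image.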

I’m not sure whether it supports SQLite at all, but as an alternative you can try Bitnami’s MediaWiki image. Bitnami often creates better-designed images, and the description mentions a way to do backups as well:

https://hub.docker.com/r/bitnami/mediawiki/