Hi all, I’m struggling with container startup issues after a hardware reboot on containers that have an NFS volume.
Environment
- Ubuntu box (22.04.2) that I’m using as a Docker host
- Docker cli (was 24.0.1 but just updated to 24.0.6 and have the same results)
- Synology NAS w/ several NFS shares set up
I have 16 containers configured (all set to “Restart Always”) and everything works perfectly until the host computer is rebooted. 6 of the containers have an NFS volume specified, and all 6 of them fail to start. If I hop into Portainer after a reboot, they are all in an “Exited” state. If I simply select them and hit “Start”, all six start up just fine and do everything I expect them to do.
I’ve spent hours scouring the internet, and things that apparently work for others have so far been unsuccessful for me.
After I manually start all of the containers and have them working, running systemctl --all list-units | grep .mount shows the mounted NFS drives. None of these mounts are present in this list after a reboot until I manually start the containers.
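To double-check this outside of systemd, the kernel's mount table can be read directly. A minimal sketch, assuming a Linux host where /proc/mounts is readable; the function name nfs_mount_points is just something made up for illustration:

```shell
#!/bin/sh
# Print the mount points of all NFS filesystems in a mounts table.
# $1: file in /proc/mounts format (defaults to /proc/mounts itself).
nfs_mount_points() {
    # Field 2 is the mount point, field 3 the filesystem type.
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Only read the live table where it exists (i.e. on Linux).
if [ -r /proc/mounts ]; then
    nfs_mount_points
fi
```

Given what systemctl shows, I would expect this to print nothing right after a reboot and to list the volume paths once the containers have been started by hand.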
As per one suggestion I found when researching, I’ve tried testing out one drive by modifying docker.service and adding it to the “After” and “Requires” lines of the Unit section like:
# /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service containerd.service time-set.target var-lib-docker-volumes-nas_entertainment-_data.mount
Wants=network-online.target containerd.service
Requires=docker.socket var-lib-docker-volumes-nas_entertainment-_data.mount
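For reference, the mount unit name is meant to follow systemd's path-escaping rule for the volume's mount point. A rough sketch of how that name can be derived, assuming Docker's default data root (/var/lib/docker); mount_unit_for is a hypothetical helper name, and the fallback branch is a simplification that only handles plain paths:

```shell
#!/bin/sh
# Print the systemd mount unit name that corresponds to a mount point path.
mount_unit_for() {
    if command -v systemd-escape >/dev/null 2>&1; then
        # systemd's own tool: turns a path into an escaped unit name.
        systemd-escape --path --suffix=mount "$1"
    else
        # Simplified fallback for plain paths only: strips the leading "/"
        # and maps "/" to "-". It does NOT escape special characters such
        # as "-" the way systemd-escape does.
        printf '%s.mount\n' "$(printf '%s' "${1#/}" | tr '/' '-')"
    fi
}

mount_unit_for /var/lib/docker/volumes/nas_entertainment/_data
```

For the nas_entertainment volume this should give the same unit name as in the After and Requires lines above.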
With that change in place, the container using it still does not start after a reboot. Listing all of the .mount units now does show a reference to it (which is more than I was seeing after a reboot previously), but it is in a not-found / inactive / dead state.
● var-lib-docker-volumes-nas_entertainment-_data.mount not-found inactive dead var-lib-docker-volumes-nas_entertainment-_data.mount
Not sure if this will show up properly, but a screen shot summary of what the above looks like:
As per another suggestion I came across, I tried creating a drop-in file with pretty much the same “After” and “Requires” settings as I had added to the docker.service file. As best I could tell, Docker wasn’t picking up the file at all (didn’t see the “not found / inactive / dead” mount like I did above), so I apparently did something wrong there, but it seems like I would have gotten the same results.
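For anyone wanting to reproduce the drop-in attempt, this is roughly the shape of it. It is a sketch and not necessarily exactly what I had: the file name nfs-mount.conf and the helper write_docker_dropin are made up for illustration, and I'm assuming the usual drop-in directory /etc/systemd/system/docker.service.d:

```shell
#!/bin/sh
# Write a systemd drop-in that orders docker.service after a mount unit.
# $1: drop-in directory, $2: mount unit name
write_docker_dropin() {
    dropin_dir=$1
    mount_unit=$2
    mkdir -p "$dropin_dir" || return 1
    # After= and Requires= in a drop-in add to the values already set
    # in the main unit file rather than replacing them.
    cat > "$dropin_dir/nfs-mount.conf" <<EOF
[Unit]
After=$mount_unit
Requires=$mount_unit
EOF
}

# Intended use on the Docker host (as root). Left as comments so the
# sketch does not touch the system when it is run or sourced:
#   write_docker_dropin /etc/systemd/system/docker.service.d \
#       var-lib-docker-volumes-nas_entertainment-_data.mount
#   systemctl daemon-reload
#   systemctl cat docker.service   # prints the unit plus any drop-ins found
```

If the drop-in had been picked up, I'd expect systemctl cat docker.service to show its contents below the main unit file, which would at least confirm whether the file was being read at all.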
Below is an example of how I created all of the volumes and the containers that use them. Again, this combo works perfectly after I simply push the Start button on the “Exited” container in Portainer. All of them continue working just fine until the box is rebooted.
docker volume create --driver local \
    --opt type=nfs \
    --opt o=addr=192.168.26.5,rw \
    --opt device=:/volume1/Entertainment \
    nas_entertainment
docker run -d \
    --name jellyfin \
    --hostname jellyfin \
    --net=vlan32 \
    --ip=192.168.32.11 \
    --restart=always \
    -v jellyfin_config:/config \
    -v jellyfin_cache:/cache \
    -v nas_entertainment:/media \
    jellyfin/jellyfin
This is driving me bonkers!!! I’m not incredibly Linux savvy (or Docker savvy either, for that matter… I’ve been using it for a couple of years, but definitely with a hobbyist skillset), so it’s entirely possible that I’m overlooking something obvious. Hopefully that’s the case and it jumps out at one of you Docker ninjas.
Any ideas greatly appreciated!