When a persistent volume is mounted by a service on a Swarm worker node, and we modify the Swarm service to mount a different persistent volume, this can cause high I/O wait on the first Swarm worker when we remove the first persistent volume.
Environment
We use Trident + ONTAP Select (iSCSI) to provide persistent volumes for our services on Docker Swarm clusters.
The Trident Docker plugin should clear the multipath links to an unused persistent volume before deleting the volume on the ONTAP backend.
It is not clear to me whether Docker Swarm must call the Trident plugin on each Swarm worker to do this, or whether Docker Swarm only has to call the Trident plugin on the Swarm manager, and the Trident plugin on that manager then has to call the Trident plugins on every Swarm worker node.
- Trident version: 20.10.16
- Trident installation flags used: [e.g. -d -n trident --use-custom-yaml]
- Container runtime: Docker version 25.0.3, build 4debf41
- Docker Swarm mode
- OS: Rocky Linux release 9.3 (Blue Onyx)
- NetApp backend types: NetApp Release 9.8P6
To Reproduce
Steps to reproduce the behavior:
start.sh creates a Docker Swarm service with a persistent volume:
Volumes
export SERVICE_TEST_VOLUME=TestVolume1
export SERVICE_TEST_VOLUME_SIZE='1gb'
vol1=$(docker volume inspect $SERVICE_TEST_VOLUME 2>/dev/null | wc -c)
if [ "$vol1" -gt 3 ]
then
echo "$SERVICE_TEST_VOLUME exists"
else
echo "Creating volume $SERVICE_TEST_VOLUME"
docker volume create --driver=netapp --name=$SERVICE_TEST_VOLUME -o size=$SERVICE_TEST_VOLUME_SIZE -o fileSystemType=ext4 -o spaceReserve=volume
docker run --rm -v $SERVICE_TEST_VOLUME:/data busybox rmdir /data/lost+found
fi
docker stack deploy -c docker-compose.yml --resolve-image=always --prune --with-registry-auth SERVICE_TEST
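As an aside, the volume-exists test above can also be written against the exit code of `docker volume inspect` instead of the length of its output; a minimal sketch (the `DOCKER_BIN` indirection and `volume_exists` helper are ours, not part of start.sh — the indirection only exists so the logic can be exercised with a stub binary):

```shell
# Sketch: existence check via the exit code of `docker volume inspect`.
# DOCKER_BIN is a hypothetical override so the logic can be tested
# without a running Docker daemon.
DOCKER_BIN="${DOCKER_BIN:-docker}"
SERVICE_TEST_VOLUME="${SERVICE_TEST_VOLUME:-TestVolume1}"

volume_exists() {
    # `docker volume inspect` exits non-zero when the volume is absent.
    "$DOCKER_BIN" volume inspect "$1" >/dev/null 2>&1
}

if volume_exists "$SERVICE_TEST_VOLUME"; then
    echo "$SERVICE_TEST_VOLUME exists"
else
    echo "Creating volume $SERVICE_TEST_VOLUME"
fi
```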
We deploy this service on our Swarm cluster. The Swarm manager starts the service on worker node A.
[root@nodeA:]# mount |grep testv
/dev/mapper/3600a098056303030313f526b682f4279 on /local/docker-data/plugins/b8fe688a4fd41d4af97f5de3ce33dee1f7f862d89ba982eec79bf5c785b93c9c/propagated-mount/netappdvp_testvolume type ext4 (rw,relatime,stripe=16)
/dev/mapper/3600a098056303030313f526b682f4279 on /local/docker-data/plugins/b8fe688a4fd41d4af97f5de3ce33dee1f7f862d89ba982eec79bf5c785b93c9c/propagated-mount/netappdvp_testvolume type ext4 (rw,relatime,stripe=16)
[root@nodeA:]# multipath -ll
3600a098056303030313f526b682f4279 dm-8 NETAPP,LUN C-Mode
size=954M features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 4:0:0:227 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:0:0:227 sdb 8:16 active ready running
Then we modify the volume name to TestVolume2 and redeploy the service
export SERVICE_TEST_VOLUME=TestVolume2
The service is stopped on node A.
NetApp Trident creates a new volume, TestVolume2.
The service is started on another Swarm worker node: node B.
On node A we can no longer see TestVolume1 with “mount |grep TestVolume1”,
but there is still some multipath information left on node A:
[root@nodeA:]# mount |grep testv
[root@nodeA:]# multipath -ll
3600a098056303030313f526b682f4279 dm-8 NETAPP,LUN C-Mode
size=954M features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 4:0:0:227 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 3:0:0:227 sdb 8:16 active ready running
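For triage, the stale map on node A can be spotted by cross-checking `multipath -ll` against `mount`. A minimal sketch (the `list_unmounted_maps` helper is ours, not part of Trident), assuming WWID-named `/dev/mapper` devices and reading both outputs from capture files so it can be tested offline:

```shell
# Sketch: list multipath maps that are not backing any current mount.
# $1 = file containing `multipath -ll` output
# $2 = file containing `mount` output
list_unmounted_maps() {
    multipath_out="$1"
    mount_out="$2"
    # Map WWIDs are the first field of the topology header lines,
    # e.g. "3600a0980... dm-8 NETAPP,LUN C-Mode".
    awk '/dm-[0-9]+/ { print $1 }' "$multipath_out" | while read -r wwid; do
        grep -q "/dev/mapper/$wwid " "$mount_out" || echo "$wwid"
    done
}
```

On node A after the service moved to node B, this prints `3600a098056303030313f526b682f4279`, the map Trident left behind.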
Then, on one of the Swarm managers, we run “docker volume rm TestVolume1”. On node A the paths turn faulty and I/O wait rises:
[root@nodeA:~]# multipath -ll
3600a098056303030313f526b682f4279 dm-8 NETAPP,LUN C-Mode
size=954M features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 4:0:0:227 sdc 8:32 **failed faulty running**
`-+- policy='service-time 0' prio=0 status=enabled
  `- 3:0:0:227 sdb 8:16 failed faulty running
[root@nodeA:~]# top
top - 18:28:57 up 1 day, 2:02, 2 users, load average: 0.80, 0.30, 0.10
Tasks: 310 total, 1 running, 309 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.3 sy, 0.0 ni, 82.9 id, 16.6 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7656.0 total, 5421.9 free, 1101.3 used, 1402.1 buff/cache
MiB Swap: 6144.0 total, 6144.0 free, 0.0 used. 6554.7 avail Mem
To clear the high I/O wait we have to use the dmsetup command:
[root@nodeA:]# dmsetup -f remove 3600a098056303030313f526b682f4279
[root@nodeA:]# multipath -ll
[root@nodeA:~]# top
top - 18:29:50 up 1 day, 2:03, 2 users, load average: 0.97, 0.43, 0.16
Tasks: 306 total, 1 running, 305 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.0 us, 1.9 sy, 0.0 ni, 97.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7656.0 total, 5454.4 free, 1070.0 used, 1400.7 buff/cache
MiB Swap: 6144.0 total, 6144.0 free, 0.0 used. 6586.0 avail Mem
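A gentler workaround than forcing the map away with dmsetup may be `multipath -f <wwid>` (which refuses to flush a map that is still open) followed by deleting the orphaned SCSI path devices via sysfs. Since those commands need a live multipath setup, the sketch below only prints the sequence (dry run); `cleanup_cmds` is our own helper name, and the WWID and sdb/sdc device names come from the output above:

```shell
# Sketch (dry run): print the cleanup commands for a stale multipath map
# so the sequence can be reviewed before running it on a worker node.
# $1 = map WWID, remaining args = orphaned SCSI path devices.
cleanup_cmds() {
    wwid="$1"; shift
    # Flush the unused map; `multipath -f` fails if the map is still open.
    echo "multipath -f $wwid"
    # Then drop the now-orphaned SCSI path devices.
    for dev in "$@"; do
        echo "echo 1 > /sys/block/$dev/device/delete"
    done
}

cleanup_cmds 3600a098056303030313f526b682f4279 sdb sdc
```

Of course this is only a manual workaround; the expectation remains that the Trident plugin performs this cleanup itself before the backend volume is deleted.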