I have a laptop and a server running Docker, both using the ZFS storage driver. Having spent a bit of time recently learning more about how containers and images are organised on the file system, I realise that both machines have large numbers of orphaned datasets - by that I mean datasets that are not referenced, directly or indirectly via a snapshot or clone, by any containers or images.
Questions of this ilk come up from time to time - I’ve done my research and I understand that a small number of images can give rise to large numbers of datasets. That’s fine, but there are significantly more datasets than what the containers and images reference.
I assume something happens that breaks the relationship - whether that be system reboots, upgrades, reinstalls, I don’t know. All of these things happen from time to time.
I can easily write a script to destroy the orphaned datasets but I’d like to be sure that what I plan to do is correct and that I am not missing something.
Here are some figures:
        | C [1] | I [2] | D [3] | DR [4] | O [5]
--------------------------------------------------
Laptop  |   14  |  6    | 117   |  44    |  73
Server  |   20  | 18    | 576   | 128    | 448
[1] Containers: docker container ls -qa | wc -l
[2] Images: docker image ls -q | wc -l
[3] Datasets: zfs list -Hr system/docker | awk 'NR>1' | wc -l (doesn’t count the root dataset)
[4] Datasets referenced: see the bash script listing below.
[5] Orphans = D - DR
# Get dataset ancestors
dataset_ancestors() {
  d="$1"
  until [[ "$d" == '-' ]]
  do
    echo "$d"
    d="$(zfs get -H origin "$d" | awk -F"[\t@]" '{print $3}')"
  done
}
export -f dataset_ancestors
# All datasets associated with Docker image and/or container
# bash -c rather than sh -c: the exported function and its [[ ]] test need bash
( docker image     ls -q  | xargs -I {} bash -c "dataset_ancestors \$(docker image     inspect \$1 | jq -r '.[].GraphDriver.Data.Dataset')" _ {}
  docker container ls -qa | xargs -I {} bash -c "dataset_ancestors \$(docker container inspect \$1 | jq -r '.[].GraphDriver.Data.Dataset')" _ {}
) | sort -u | wc -l
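To get the names of the orphans rather than just a count, I think a plain set difference between the two lists would do. This is only a minimal sketch: the function name orphaned_datasets and the two input files are my own placeholders, and it assumes one dataset name per line in each file.

```shell
# Print every dataset listed in the first file that is absent from the second.
# $1: file with all datasets, $2: file with referenced datasets (one name per line).
orphaned_datasets() {
  all_sorted=$(mktemp)
  ref_sorted=$(mktemp)
  # comm needs both inputs sorted with the same collation, so pin the locale.
  LC_ALL=C sort -u "$1" > "$all_sorted"
  LC_ALL=C sort -u "$2" > "$ref_sorted"
  # -2 drops lines unique to the second file, -3 drops lines common to both.
  LC_ALL=C comm -23 "$all_sorted" "$ref_sorted"
  rm -f "$all_sorted" "$ref_sorted"
}
```

The first file could come from zfs list -H -o name -r system/docker | awk 'NR>1', and the second from the script above with the final wc -l dropped. The result should have D - DR lines if the counts in the table are right.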
I’ve already done a docker system prune -a and I don’t believe there’s any build cache (I think those datasets have shorter names so are easily recognisable). All the orphaned datasets are named like the example below; some have -init appended, but most don’t:
ff9f4eb9964469d8399b93c705509306d3c18a7de368c53b74546c754557fa06
Am I missing anything? Is there anything else in Docker that could own those orphaned datasets that I should check? If not, then I should be able to just zfs destroy the orphaned datasets (in clone/snapshot order), right?
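By "clone/snapshot order" I mean that a clone has to go before the dataset whose snapshot it was cloned from. A rough sketch of how I'd work out that order with tsort is below. The function name destroy_order is made up, and it assumes its input is one "name<TAB>origin" line per orphan (origin is - for a dataset that isn't a clone) and that dataset names contain no whitespace. It only prints names and destroys nothing.

```shell
# Read "dataset<TAB>origin" lines on stdin and print the datasets so that
# every clone appears before the dataset it was cloned from.
destroy_order() {
  awk -F'\t' '
    {
      name[NR] = $1
      origin[NR] = $2
      sub(/@.*/, "", origin[NR])   # keep only the dataset part of pool/ds@snap
      listed[$1] = 1
    }
    END {
      for (i = 1; i <= NR; i++) {
        if (origin[i] in listed)
          print name[i], origin[i]   # constraint: clone before its origin
        else
          print name[i], name[i]     # identical pair: node only, no constraint
      }
    }' | tsort
}
```

The input could probably be produced with zfs get -H -o name,value origin run over the orphan list. Origins that are not themselves in the list (i.e. datasets still referenced by Docker) are ignored, so only orphans are printed. I'd review the output by hand before feeding any of it to zfs destroy.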