Cleaning ZFS Orphaned Datasets

I have a laptop and a server running Docker, both using the ZFS storage driver. Having spent a bit of time recently learning more about how containers and images are organised on the file system, I realise that both machines have large numbers of orphaned datasets. By that I mean datasets that are not referenced, directly or indirectly via a snapshot or clone, by any container or image.

Questions of this ilk come up from time to time - I’ve done my research and I understand that a small number of images can give rise to large numbers of datasets. That’s fine, but there are significantly more datasets than what the containers and images reference.

I assume something happens that breaks the relationship - whether that be system reboots, upgrades, reinstalls, I don’t know. All of these things happen from time to time.

I can easily write a script to destroy the orphaned datasets but I’d like to be sure that what I plan to do is correct and that I am not missing something.

Here are some figures:

        | C [1] | I [2] | D [3] | DR [4] | O [5]
Laptop  |   14  |   6   |  117  |   44   |   73
Server  |   20  |  18   |  576  |  128   |  448

  1. Containers: docker container ls -qa | wc -l
  2. Images: docker image ls -q | wc -l
  3. Datasets: zfs list -Hr system/docker | awk 'NR>1' | wc -l (doesn’t count the root dataset)
  4. Datasets referenced: see the bash script listing below.
  5. Orphans = D - DR
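
# Get dataset ancestors: print the dataset, then follow its origin
# (the snapshot it was cloned from) until a dataset with no origin is reached
dataset_ancestors() {
  local d="$1"
  until [[ "$d" == '-' ]]
  do
    echo "$d"
    # origin is reported as dataset@snapshot, or '-' when there is none
    d="$(zfs get -H origin "$d" | awk -F"[\t@]" '{print $3}')"
  done
}
export -f dataset_ancestors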

# All datasets associated with Docker image and/or container
( docker image     ls -q  | xargs -I {} bash -c "dataset_ancestors \$(docker image     inspect \$1 | jq -r '.[].GraphDriver.Data.Dataset')" _ {}
  docker container ls -qa | xargs -I {} bash -c "dataset_ancestors \$(docker container inspect \$1 | jq -r '.[].GraphDriver.Data.Dataset')" _ {}
) | sort -u | wc -l
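
The listing above only counts the referenced datasets. To see which datasets are orphaned rather than how many, the two lists can be compared by name. The following is a minimal sketch, assuming both lists have been saved to files with one dataset name per line; the function name `list_orphans` and the file names in the usage comments are made up for illustration.

```bash
# list_orphans ALL REFERENCED
# Prints every name found in the file ALL that does not appear in the file
# REFERENCED. Both files hold one dataset name per line.
list_orphans() {
  # -F fixed strings, -x whole-line match, -f patterns from file, -v invert
  grep -vxFf "$2" "$1" | sort -u
}

# Possible usage (file names are hypothetical):
#   zfs list -H -o name -r system/docker | awk 'NR>1' > all.txt
#   ( ...the two docker pipelines from the listing above... ) | sort -u > referenced.txt
#   list_orphans all.txt referenced.txt > orphans.txt
```

Matching whole lines as fixed strings avoids treating one dataset name as a prefix of another, for example a layer and its `-init` sibling.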

I’ve already done a docker system prune -a and I don’t believe there’s any build cache (I think those datasets have shorter names, so they are easily recognisable). All the orphaned datasets are named like the example below; some have -init appended, though most don’t.


Am I missing anything? Is there anything else in Docker that could own those orphaned datasets that I should check? If not, then I should be able to just zfs destroy the orphaned datasets (in clone/snapshot order), right?
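
For the destroy order, one option is to let `tsort` work it out from the `origin` property. The sketch below is an assumption about how that could look, not a tested cleanup procedure: the function name `destroy_order` is hypothetical, it expects `name<TAB>origin` lines for the orphan candidates only (the layout printed by `zfs list -H -o name,origin`), and it assumes dataset names contain no whitespace, which holds for the hex IDs Docker generates.

```bash
# destroy_order: reads "name<TAB>origin" lines on stdin and prints the names
# so that every clone is listed before the dataset it was cloned from.
# Origins that are not themselves in the input are never printed.
destroy_order() {
  awk -F'\t' '
    { sub(/@.*/, "", $2); origin[$1] = $2 }   # origin is dataset@snapshot; keep the dataset part
    END {
      for (d in origin) {
        o = origin[d]
        if (o in origin) print d, o           # clone must go before its origin
        else             print d, d           # self-pair: just registers the name
      }
    }' | tsort
}

# Possible usage, printing the commands instead of running them
# (orphans.txt is a hypothetical file of orphan names, one per line):
#   zfs list -H -o name,origin -r system/docker \
#     | awk -F'\t' 'NR==FNR {keep[$1]; next} $1 in keep' orphans.txt - \
#     | destroy_order | xargs -n1 echo zfs destroy -r
```

Printing the commands with `echo` first leaves room to review the list before anything is removed. The `-r` is there on the assumption that each orphan still carries the snapshot its clones were created from, which has to go together with the dataset.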