Caching images and layers in a GitHub Actions workflow

We are trying to cache the image used by our docker run command so it is not downloaded again every time the pipeline runs. The image is used primarily to spin up a database instance for our service's integration tests.

As far as I understand, Docker on Linux stores its image data under /var/lib/docker. Our first approach was therefore to persist the contents of /var/lib/docker across runs. The first run downloaded and used the image successfully, but every later run failed with the error “layer does not exist”.
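Simplified, the approach is equivalent to something like the following, assuming actions/cache is used to persist the directory (the step name and cache key are only illustrative):

```yaml
# This is the approach that fails with “layer does not exist” after the
# first run; the cache key is only illustrative.
- name: Cache Docker data directory
  uses: actions/cache@v4
  with:
    path: /var/lib/docker
    key: docker-data-${{ runner.os }}
```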

This suggests that the folder's contents are not valid when restored in a later run. Why does this happen, and what is a reliable way to cache images and layers across pipeline runs?

Thanks

Yes, the default Docker data directory is /var/lib/docker, but in a CI/CD pipeline it is not that simple. Restoring that directory onto a fresh runner easily leaves the daemon's own metadata out of sync with the restored layer files, which is exactly the kind of inconsistency that shows up as “layer does not exist”. Each provider has its own caching mechanism; it is sometimes called an “artifact” or just an “input” and “output”, depending on what you want to do with it. Docker also supports remote caches, at the cost of some network traffic: you can store the build cache in a registry and download it when you build the image. Depending on the build process and the network speed, that can be faster than rebuilding the image from scratch. The --cache-to and --cache-from options of docker buildx build can both refer to a cache in a registry.
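As a sketch, a registry-backed build cache can look like this in a workflow step. The image references (ghcr.io/OWNER/app) are placeholders, and it assumes you are already logged in to the registry:

```yaml
# Assumes a builder with the docker-container driver, since the default
# “docker” driver typically cannot export a cache:
#   docker buildx create --use
# ghcr.io/OWNER/app is a placeholder image reference.
- name: Build with a registry-backed cache
  run: |
    docker buildx build \
      --cache-from type=registry,ref=ghcr.io/OWNER/app:buildcache \
      --cache-to type=registry,ref=ghcr.io/OWNER/app:buildcache,mode=max \
      --tag ghcr.io/OWNER/app:latest \
      --push .
```

mode=max exports the layers of all build stages instead of only the final image, which usually gives more cache hits at the cost of a larger cache.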

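For the original use case in this thread, an image that is only pulled and not built, the build cache does not apply. One workaround is to save the pulled image to a tarball, cache the tarball with actions/cache, and load it back on later runs. A sketch, with postgres:16 standing in for whatever database image is actually used:

```yaml
# postgres:16 and the paths below are placeholders for this example.
- name: Cache database image
  id: image-cache
  uses: actions/cache@v4
  with:
    path: ~/image-cache
    key: image-cache-postgres-16

- name: Pull and save image on cache miss
  if: steps.image-cache.outputs.cache-hit != 'true'
  run: |
    mkdir -p ~/image-cache
    docker pull postgres:16
    docker save -o ~/image-cache/postgres.tar postgres:16

- name: Load image on cache hit
  if: steps.image-cache.outputs.cache-hit == 'true'
  run: docker load -i ~/image-cache/postgres.tar
```

docker load goes through the daemon's normal import path instead of overwriting its on-disk state, so it avoids the consistency problem described above.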
For GitHub Actions, you can read their documentation

and maybe this blog post: