What happens when there is a power failure during “docker pull”?
I’m asking because of articles like this: [1]
“Adds the ability to perform atomic and durable image pulling, meaning you won’t end up with a corrupted container if power is lost in the midst of an update.”
Each image (referenced as repo:tag or repo@sha256:<digest>) has a manifest, which lists the image layers in order together with their checksums. Each layer is stored in a separate archive.
Each layer is downloaded, its checksum is verified, and it is extracted → the layer is considered pulled.
Once all layers are pulled, the image is considered pulled. Docker usually downloads several layers in parallel, but only extracts one at a time.
In case of an outage, the layers that were already pulled are still available, layers that were interrupted while being processed are pulled again, and layers that had not been started yet are downloaded, verified and extracted. Once all layer pulls have completed, the image is considered pulled.
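To make the resume behaviour concrete, here is a minimal sketch in Python. It is a toy model, not Docker's actual code: `pull_image`, `layer_store`, `PowerFailure` and the in-memory `registry` are all made-up names for illustration, and "extracting" a layer is reduced to storing its bytes in a dict.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


class PowerFailure(Exception):
    """Stands in for an outage in the middle of a pull (hypothetical)."""


def pull_image(manifest, fetch, layer_store):
    """Pull each layer in `manifest`; return the digests fetched in this run."""
    fetched = []
    for layer_digest in manifest:
        if layer_digest in layer_store:
            continue  # pulled before the outage, nothing to do
        blob = fetch(layer_digest)  # download
        if sha256_digest(blob) != layer_digest:  # verify
            raise ValueError("checksum mismatch for " + layer_digest)
        layer_store[layer_digest] = blob  # "extract": layer now counts as pulled
        fetched.append(layer_digest)
    return fetched


# Toy registry with three layers; the manifest lists their digests in order.
blobs = [b"base layer", b"runtime layer", b"app layer"]
registry = {sha256_digest(b): b for b in blobs}
manifest = [sha256_digest(b) for b in blobs]


def flaky_fetch(layer_digest):
    # Simulate the power going out while the second layer is downloading.
    if layer_digest == manifest[1]:
        raise PowerFailure()
    return registry[layer_digest]


layer_store = {}
try:
    pull_image(manifest, flaky_fetch, layer_store)
except PowerFailure:
    pass
pulled_before_resume = list(layer_store)

# Second attempt after "reboot": only the missing layers are fetched.
resumed = pull_image(manifest, registry.__getitem__, layer_store)
```

The first layer survives the simulated outage, and the second run only fetches the two layers that were not completed. The image as a whole is only usable once every digest in the manifest is present in the store.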
I can agree that this is a durable approach. It looks neither atomic nor transactional to me… Though the decision that leads to an image being considered pulled is probably atomic.
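As an illustration of what an atomic "now it counts as pulled" decision can look like in general, here is a sketch of the usual write-to-temp-file, fsync, rename pattern in Python. This is an assumption about how such a commit point could be built, not a description of Docker's or balenaEngine's on-disk format; `commit_image`, `is_pulled` and the `<tag>.json` layout are hypothetical.

```python
import json
import os
import tempfile


def commit_image(images_dir, tag, manifest):
    """Publish `manifest` as the record for `tag` in a single atomic step."""
    final_path = os.path.join(images_dir, tag + ".json")
    fd, tmp_path = tempfile.mkstemp(dir=images_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(manifest, f)
            f.flush()
            os.fsync(f.fileno())  # force the contents to disk before renaming
        # Rename within one filesystem is atomic on POSIX: readers see either
        # no record or the complete record, never a half-written one.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # For full durability the directory itself would also need an fsync;
    # left out here to keep the sketch short.
    return final_path


def is_pulled(images_dir, tag):
    return os.path.exists(os.path.join(images_dir, tag + ".json"))


images_dir = tempfile.mkdtemp()
before = is_pulled(images_dir, "app_v2")
record_path = commit_image(images_dir, "app_v2", ["sha256:aaa", "sha256:bbb"])
after = is_pulled(images_dir, "app_v2")
```

If power is lost before `os.replace`, only a stray `.tmp` file is left behind and the image is simply not considered pulled. The individual layer pulls are still not one transaction, but the final switch is.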
Update:
Though, if I look at the feature list of balenaEngine, it seems to use its own strategy:
** Adds the ability to perform true container deltas, 10-70 times more bandwidth efficient than the standard layer-based Docker pull.
** Adds the ability to perform atomic and durable image pulling, meaning you won’t end up with a corrupted container if power is lost in the midst of an update. This is not something one plans for in a data center but is a daily occurrence for devices in the physical world.
** Is conservative about how much it writes to the filesystem, performing on-the-fly extraction of pulled layers and avoiding writing the compressed layer to disk
Even though my response is valid for Docker, it is most likely inaccurate for the modifications balena made in their forked engine.
Thanks for the reply. This is along the lines of what I would have expected.
So how would someone use a container runtime on an embedded system without hacking it? It looks like container engines cater to the cloud but not so much to the embedded market, although this seems to be an emerging market.
I guess non-atomic software/container updates could also be problematic in the PC/server/cloud market.
The current non-software solution seems to be getting a UPS, but this is not possible for embedded systems.