Sorry if this is a stupid question, but I am relatively new to the world of containers. My question is about the layered structure of Docker images, which makes it possible - if I understand the documentation correctly - to share layers (dependencies) between multiple images and containers. This reduces downloads and storage, because the same layer does not need to be downloaded and saved multiple times.
But what happens at runtime in RAM, when container instances are created and more than one container needs the same layer? Will this layer be loaded into memory multiple times, or just once? Or asked differently: does the layered structure reduce memory needs in RAM at runtime, or does it only save storage and download traffic?
Thanks for your answers and sorry for my bad English. I did my best ;-D
Docker uses a storage driver like overlay2 to create a virtual filesystem, composed of the read-only image layers plus a copy-on-write layer, and uses it as the container's filesystem. In reality, only the copy-on-write layer occupies additional disk space.
Docker will not load any layers into memory, nor does it boot an OS when starting the container. It starts one or more processes defined in the entrypoint script - nothing else.
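As a rough illustration of what the storage driver does, here is a toy Python emulation of overlay lookup and copy-on-write (a sketch only - the real overlay2 work happens in the kernel, and the directory layout here is made up):

```python
import os
import tempfile

def overlay_read(upperdir, lowerdirs, relpath):
    """Toy overlay lookup: the writable upper layer wins; otherwise
    fall back through the read-only image layers in order."""
    for d in [upperdir, *lowerdirs]:
        candidate = os.path.join(d, relpath)
        if os.path.exists(candidate):
            with open(candidate) as f:
                return f.read()
    raise FileNotFoundError(relpath)

def overlay_write(upperdir, relpath, data):
    """Toy copy-on-write: writes always land in the upper layer;
    the image layers below are never modified."""
    target = os.path.join(upperdir, relpath)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "w") as f:
        f.write(data)

# Demo: one shared read-only "image layer", two "containers" with
# their own writable layers on top of it.
lower = tempfile.mkdtemp()
upper_a, upper_b = tempfile.mkdtemp(), tempfile.mkdtemp()
with open(os.path.join(lower, "config.txt"), "w") as f:
    f.write("from image")

print(overlay_read(upper_a, [lower], "config.txt"))  # from image
overlay_write(upper_a, "config.txt", "changed in A")
print(overlay_read(upper_a, [lower], "config.txt"))  # changed in A
print(overlay_read(upper_b, [lower], "config.txt"))  # from image
```

Note how the write in container A never touches the lower layer, so container B still sees the unmodified image content.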
Assuming all images use the same base image and as such the same image layers, the shared image layers will be downloaded and stored once. If you build your own images, you are in full control and can make them behave like that. But if you are going to pull different images from different maintainers, there is no way to control which exact base image version they used, and you would most likely end up pulling each image completely anyway if they don't share the same parent base image (as in an identical sha256 image checksum, not just "the same tag").
That said, Docker is equipped to benefit from caching image layers, but the images you use need to support that scenario in order to achieve the least use of storage and transfer.
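Since layers are content-addressed, two images share storage exactly for the digests that appear in both of their layer lists. A small sketch with made-up digests (in practice the lists come from the image metadata, e.g. via `docker image inspect`):

```python
# Hypothetical layer digest lists for two images; real digests are
# full sha256 hashes reported per image by the Docker engine.
image1_layers = ["sha256:base", "sha256:depA", "sha256:depB", "sha256:app1"]
image2_layers = ["sha256:base", "sha256:depA", "sha256:depC", "sha256:app2"]

# Layers present in both lists are downloaded and stored only once.
shared = set(image1_layers) & set(image2_layers)
print(sorted(shared))  # ['sha256:base', 'sha256:depA']
```

If the maintainers built from different base image versions, the base digests differ and the intersection is empty, even when the tags look identical.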
Thanks for your answer, but this is all clear to me. I know that containers have no OS kernel of their own. My question is: what happens in RAM when I start two containers using exactly the same layers? For example:
Image1: Alpine 3.14 + Dep. A + Dep. B + App1 → Container1
Image2: Alpine 3.14 + Dep. A + Dep. C + App2 → Container2
Both are running. Can they share the "Alpine" and the "Dep. A" layer in the same (physical) memory area, so that I need less memory? I heard somewhere that it depends on the storage driver? I'm not sure…
Is it? Could you help me understand why you mix image layers (which only impact storage) and RAM/memory? Maybe you understand something that I have missed during the last 7 years. I am not sure what might make you think that image layers impact your memory.
Did you mean to write storage? A specific image layer exists once, regardless of how many images use it or how many containers are created from them. That is, if you use a storage driver like overlay2 or aufs - I have no idea how the btrfs storage driver handles it.
Is it possible that your objective is not related to image layers at all, but is rather the question whether processes in different containers share memory, in the sense that dynamic libraries are loaded only once and shared amongst the containers? As this would break the strong isolation, I am not surprised that it's not on the feature list.
If you are referring to the page cache, this is a deeper level of understanding the OS than I claim to have, but I would think it does not increase the size of the cache unless you write to the filesystem inside the container. For example, if you have 10 containers using the same image layers, and one of the layers contains a 1 megabyte read-only database file which you constantly read from all of your containers, then the overlay2 filesystem would check whether the file is on the writable layer and, if not, find it on one of the read-only image layers and read it from there. If that file is in the page cache, in theory it would be cached only once, because physically it is only one file.
If you write to a file frequently, then each container gets its own initial copy of the original file to write to, so each copy could be cached separately.
So I can imagine this in theory, but I don't know how much it matters in practice, because you probably don't want to read a read-only file frequently, and I don't know how often the system files are read.
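One way to picture why the cache holds a lower-layer file only once: the page cache is keyed by the underlying inode, not by the path a container sees the file under. A minimal Python sketch - using a hard link to simulate two container views of one lower-layer file, and a copy to simulate a copy-up after a write (the link/copy setup is an assumption for illustration, not how Docker arranges files):

```python
import os
import shutil
import tempfile

d = tempfile.mkdtemp()
original = os.path.join(d, "shared.db")
with open(original, "wb") as f:
    f.write(b"x" * 1024)

# Two "container" views of the same read-only lower-layer file:
# same inode, so the kernel page cache holds the data once.
view = os.path.join(d, "view.db")
os.link(original, view)
print(os.stat(view).st_ino == os.stat(original).st_ino)  # True

# After a copy-up (a container wrote to the file), it is a distinct
# inode and gets its own page cache entries.
copied = os.path.join(d, "copied.db")
shutil.copyfile(original, copied)
print(os.stat(copied).st_ino == os.stat(original).st_ino)  # False
```

So as long as no container writes to the file, every container reads the same inode and the 1 MB is cached once, not 10 times.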
What's going on? Why so aggressive?
What I mean is something like this, which I found on Stack Overflow:
Replace aggressive with confused and you get my point of view
I sincerely wanted to understand what layers and memory would have to do with each other. I assume your SO link holds the answer. I am curious.
Update: read it. It would have been clearer if the SO link had been part of the first post.
If you remove the “different image layers” from the equation and simply start two containers based on the same image, you have the same situation.
I am surprised that the host-side virtual memory cache takes care of sharing the object between different containers. Seems I learned something new today.