How to create my own, from scratch OS image?

I am new to Docker but not to virtualization; I have a long history with VMware, Hyper-V, KVM, even RHEV. I have done some of the basics with Docker and once used LXC, so I am somewhat familiar with the idea of using canned images. I plan to create BIND/named primary and secondary servers, as well as ISC DHCP Server primary and secondary, for a small environment, and I figure this is an opportunity to learn more about Docker.

So here is the question. As I understand it, I need to create a base OS image, in my case Debian 11. I could do the typical FROM alpine:latest method, but I really want to learn how to create my own base OS image, so I can control exactly what is in it. But I have yet to find an example of how to create an OS base container. An image that includes just an application built from scratch is one thing, but how do I create an OS base image, which is a different animal, right? Someone had to create the first 'alpine' image at some point, right?

I am trying to figure out your goal, since you apparently know how you could create an image from scratch and you don't want to use an existing Debian Docker image. A Debian or an Alpine image is just a bunch of binaries and text files, such as configuration files, libraries and executables, so it is "Debian" because the files are laid out as you would find them in a Debian distribution. Someone has to create a distribution first. I never really used the first LXC, I think only from v2.0 on, but as far as I know you always needed some template, and newer versions made it easier to download them. Containers usually don't contain the full Linux distribution, so someone also needs to know which files are necessary and which could be deleted. To be honest, I don't know how the filesystems of those small base images are built, but I doubt that the maintainers build every binary from source code, so they need a template, which they probably change to optimize it, and then copy the content back into an image.

If you want to build your own distribution, that is something you will not find out here :slight_smile:, but you could search for articles like this (haven't read it).

Simply, I want to 'roll' my own. This is a practical learning experience, and it also lets me verify, from a security perspective, that the image I create contains nothing I have not explicitly added.

You raise the point, implicitly if not explicitly, that this is not all that practical, and that is true. If I were under a time limit or deadline, I would just pull the latest Debian image and go from there. But I have the time to do it, so maybe call it an intellectual exercise. :slight_smile:

I have a history of doing this. For example, when I wanted to use MicroPython on my own ESP8266-based IoT devices, I learned how to create my own MicroPython image. I was told at the time that maybe 0.1%, or some similarly small number, of MicroPython users ever created their own port of MicroPython. But it was both interesting and, to be honest, tricky to figure out.

I like what you are doing for learning and I like to do the same, but I think you know everything that is required to create the Docker image, so your question is rather how you can create a Linux distribution.

Maybe this link is a good starting point: Create a base image | Docker Documentation

I honestly have never built my own image from scratch. I usually base my images on alpine, which has no or close to no vulnerabilities.

@rimelik,
Yup, more of a customized distribution than one built completely from scratch. My thinking is that if I can understand how debian:latest is created, I will (better) understand the decisions and compromises behind how that image or build is defined.

@meyay,
Thanks for the link, I think that is the starting point for what I want to do. I still need to spend some real, serious time with Docker before I get down to the actual attempt to create an OS template, or image, from scratch.

As for the vulnerabilities, that is my past career (I am now retired) shadowing my proposed effort. I was an Enterprise IT Virtualization Architect, and often had to research and resolve CVE issues, incorporating the fixes and updates for CVEs into my designs. So, I tend to want to create things completely from scratch when I can, because I need to understand the assumptions required. When you use a canned Docker image or Dockerfile, you inherit the mindset and thinking of the author. That is not necessarily an issue, but it does mean you are blindly accepting the goals and objectives of the author.

Just to illustrate my point above…

An example is netboot/dhcpd; I reference it only because I was looking at it a few minutes ago. It is based on Ubuntu and assumes the DHCP server will be standalone. So it works, but I need a DHCP server based on Debian, and I need a primary and secondary (failover) implementation.

So, a few design questions result. Do I create a single DHCP server Docker image and just run two instances, with a shared mount point for the shared configuration file that holds the host reservations, letting one instance effectively sleep until the primary fails? Or do I create a true DHCP server primary and secondary using OMAPI keys and true DHCP failover, with a shared mount for the common configuration file? Or do I create two DHCP Docker images with no shared storage, and use incron to trigger a cascade copy of the configuration file only when it changes on the primary, as is common with DHCP servers run on hardware or VMs?
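To make the second option a bit more concrete, here is a rough sketch of what the primary's failover declaration might look like, written out from a shell script so it could be reused in an image build. The addresses, the subnet, the peer name `dhcp-failover` and the output path are placeholders I made up for illustration, not values from my environment, and I have not tested this in a container yet.

```shell
#!/bin/sh
# Sketch only: write a primary-side ISC dhcpd failover snippet.
# All addresses, the peer name and the output path are made-up placeholders.
set -eu

CONF="${CONF:-./dhcpd-failover-primary.conf}"

cat > "$CONF" <<'EOF'
# Failover relationship as seen from the primary server.
failover peer "dhcp-failover" {
  primary;
  address 192.168.10.2;        # this server (placeholder)
  port 647;
  peer address 192.168.10.3;   # the secondary (placeholder)
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;                   # set on the primary only
  split 128;                   # set on the primary only
  load balance max seconds 3;
}

subnet 192.168.10.0 netmask 255.255.255.0 {
  option routers 192.168.10.1;
  pool {
    failover peer "dhcp-failover";
    range 192.168.10.100 192.168.10.200;
  }
}
EOF

echo "wrote $CONF"
```

The secondary would declare `secondary;` with the two addresses swapped and without the `mclt` and `split` lines. The host reservations could then live in a separate file on the shared mount, pulled into both servers with an `include` statement.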

So even a simple design goal raises many implementation questions. :slight_smile: Anyone who pulls my DHCP server Docker image (if I ever publish it) will only see what I finally implemented, and never be aware of the trade-offs I made to arrive at the final image.

The official distro base container images from Docker Hub aim for a minimal image size. They usually only include the minimal required packages that make up the base of a particular distribution. Building your own distro base image makes sense if you plan to build it more frequently than the official image is built and, to keep your image size minimal, don't want to upgrade distro packages in your own app images.

Of course, each application/service image is opinionated and implements design decisions by its maintainer. There is no way around it :slight_smile:

You can find plenty of images for the same application/service, and all of them could use a different distro base image, a different version of a distro base image (as in the exact image identified by the sha256 digest in that particular repo), start as root vs. an unprivileged user, run the main process as root vs. an unprivileged user, provide different features, have entrypoint scripts with different functionality, or even store their configuration or data in different folders.

From my experience, the Git repos for most images are hosted on GitHub and can be found easily. You can validate their Dockerfiles and the scripts that are copied into the image during the image build.

With static binaries that require no dependency at all, you could even create a distroless container image.

Yup, I have been looking at several examples on GitHub, looking at how the Dockerfiles have been authored, as a starting point for learning the ways they approach the same result. For example, when implementing ISC-DHCP-Server, some use environment variables, some don't, etc. They end up at the same result, just with different steps along the way. Some images take the effort to explicitly set the locales configuration; some don't need it or don't worry about it in detail.

Because everything in my environment is Debian based, I will be looking to ensure I have a lean Debian base image, but looking at how alpine is defined is of value as well.

Just as a side note, VMware Photon is interesting as well, in that Photon was created with the goal of supporting Linux in VMware VMs. Photon has not taken over the world quite like VMware did with virtualization in total, but having more of the vertical virtualization stack supported by the same vendor has its advantages over running other Linux distributions in VMware, just as running Windows in Hyper-V has its advantages.

So this is why I didn't get a notification :smiley: (rimelek)

I see. How would that work? Would you download the filesystem from somewhere (for example using debootstrap), remove what you don't need and change some configurations, or would it be better to have the smallest possible Debian image/filesystem and add what you need?

I am just asking out of curiosity, not to suggest anything :slight_smile:

Not at all, suggestions welcome. But you are thinking about the same decision point I am. I see three possible methods:

  1. Standard minimum install, accept as is
  2. Standard minimum install, and remove what is not desired
  3. Do a kernel-only install and add components. This is possible in theory, but I have yet to research the specific details.
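Methods 1 and 2 could be sketched roughly as follows, assuming debootstrap and Docker are installed on a Debian host. The directory, mirror and image name are placeholders I picked for illustration, and the script only prints the commands unless `DRY_RUN=0` is set, since the real run needs root.

```shell
#!/bin/sh
# Sketch only: build a minimal Debian 11 root filesystem and import it
# as a base image. Paths and the image name are made-up placeholders.
set -eu

SUITE="${SUITE:-bullseye}"                      # Debian 11
ROOTFS="${ROOTFS:-./rootfs-bullseye}"           # placeholder directory
MIRROR="${MIRROR:-http://deb.debian.org/debian}"
IMAGE="${IMAGE:-local/debian-base:bullseye}"    # placeholder image name
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_base_image() {
  # Methods 1 and 2 both start from a minimal bootstrap (needs root).
  run debootstrap --variant=minbase "$SUITE" "$ROOTFS" "$MIRROR"

  # Method 2: trim what is not wanted, for example the package cache.
  run chroot "$ROOTFS" apt-get clean

  # Pack the tree and hand it to Docker as a single-layer image.
  run sh -c "tar -C '$ROOTFS' -c . | docker import - '$IMAGE'"
}

build_base_image
```

Running it as root with `DRY_RUN=0` should leave an image that can be used in a `FROM` line like any other base image. The trimming step is where the real decisions about what to keep would go; `apt-get clean` is only a stand-in.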

I am setting up a gPXE/iPXE script against a VM to test various options, so I can baseline the VM as needed. I could just clone the VM and roll back as well, but I like the idea of using the gPXE/iPXE method, because I can be really selective, consistently, about which packages are installed at initial deployment. I did something similar for an enterprise stateless global deployment for a Fortune 10 company a few years ago. We happened to be deploying ESXi at the time, but the principle is the same for any OS: a scripted build/deployment. You want it stateless, to avoid embedding defaults unless desired, so that is going to take a bit of work.

Really, this may well result in something very close to the current minimum install from the OS's standard deployment, but the benefit is understanding why everything that is present is justified. And it should be an interesting learning experience. It might be an opportunity to document the process and publish it on GitHub as well.

After your last response, I get the feeling you have been discussing distributions for the Docker host, rather than distributions for container images. At least it would explain why VMware Photon was part of the conversation at one point. A container has no kernel of its own (it shares the host's) and does not boot or start any system services - it only executes the ENTRYPOINT and/or CMD instruction declared in the Dockerfile of the image (or override values provided when starting the container).
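To illustrate, a distro base image is essentially a root filesystem plus a default command. The sketch below writes a Dockerfile of that shape from a shell script; the build directory, the tarball name `rootfs.tar.xz` and the image tag are placeholders chosen for illustration, and the tarball itself is assumed to have been produced separately.

```shell
#!/bin/sh
# Sketch only: a distro base image is a root filesystem plus a default command.
# The build directory, tarball name and image tag are made-up placeholders.
set -eu

BUILD_DIR="${BUILD_DIR:-./debian-base-build}"
mkdir -p "$BUILD_DIR"

# FROM scratch starts from an empty image; ADD unpacks a local tar archive
# into it. No kernel and no init system are part of the image.
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
FROM scratch
ADD rootfs.tar.xz /
CMD ["bash"]
EOF

echo "Dockerfile written to $BUILD_DIR"
echo "With rootfs.tar.xz in that directory, something like this should build and run it:"
echo "  docker build -t local/debian-base:bullseye $BUILD_DIR"
echo "  docker run --rm local/debian-base:bullseye cat /proc/1/comm"
```

I would expect the last command to report the name of the command itself rather than systemd or init, because the process given to `docker run` becomes PID 1 inside the container.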

You might want to take a look at this fabulous self-paced Docker training: Introduction to Containers. It gives a solid foundation about container concepts and how things are done in docker.

Actually just an example of methodology… the goal is still to have an application-specific container, just based on Debian bones rather than something else. Many of the examples I find of existing images are based on alpine or even Ubuntu. So, I want as lean a base image on Debian as I can have (for my purpose). The interesting thing is that, as many examples as exist for how to create a container, all start with a given base image, and that base image was made with assumptions and trade-offs that I want to understand. So what better way to learn about them than creating your own base image? As I noted above, call it an intellectual exercise. :slight_smile: