One container running another one on the same Docker engine

What is the simplest, most straightforward way for one container to run another container, with a different image? That is, similar to a *nix fork/execve or a Windows Process.Start().

As a very first solution, a Python-based mechanism for doing this would be sufficient, but in the long term we need a general way of doing it. We certainly need a blocking (synchronous) mechanism - say, running a compilation and blocking the caller until the compiler returns (like when you use the Docker client to run the compiler image interactively). Parallel running is a low-priority option. All relevant images run a bash shell as the default CMD, so what we need is a way for one container to start another container (on the same engine) with a given bash command line, and block until it returns.
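For concreteness, the call we are after has roughly this shape (the image name and command line are just placeholders, and this assumes the docker CLI inside the calling container can actually reach the engine - which is precisely the part I have not solved):

```python
# Desired behaviour, sketched in Python: run a bash command line in another
# image on the same engine, and block until it finishes.
import subprocess

result = subprocess.run(
    ["docker", "run", "--rm",
     "compiler-image:1.0",              # placeholder image
     "bash", "-c", "make all"],         # placeholder command line
    capture_output=True, text=True)

print(result.stdout)                    # output from the other container
exit_code = result.returncode           # its exit code, once it has returned
```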

Reason for asking:

We have developers insisting that their projects cannot be restructured to do C compilation in a job step separate from Python interpretation or document formatting. It cannot be done sequentially: in most cases, a Python script invokes e.g. the C compilation and the document generation. (To quote Ted Nelson: Everything is deeply intertwingled.)

So they demand one huge Docker image that can do everything. If we had a single configuration, that would be OK. We do not. We end up in an n-dimensional space with all possible combinations of C compiler and tool versions, Python versions, documentation tool versions and what have you. Every combination requires its own mega-image. Rather than m * n * o ( * p * q …) images, I strive for m + n + o ( + p + q …) images, the way we had it in pre-Docker times.

My attempts at including a Docker client in each image, supplying the IP address of the local Docker engine as a parameter to the calling container, to create a sort of loopback, are full of MacGyver tricks, and to be honest: I haven’t made it work in a stable way yet. There must be simpler, more straightforward ways to do it. I cannot possibly be the first one with this need!

I suspect that the solution is found on page 2 or 3 of The Complete Docker manual, but I overlooked it :slight_smile:

Yes, it is possible… As Obi-Wan would say, “Use the sock(et), Luke”. That said, it is not what you are looking for. From what you have described, I think you are asking about multi-stage builds, running multiple builds to compile different binaries and assemble a full application. My recommendation is to always try to keep images as small as possible (fewer binaries and libraries is better) and, of course, to run the different application functions isolated, each smaller than a full application inside one container.
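A minimal sketch of the socket approach, assuming the calling container is started with the host’s /var/run/docker.sock bind-mounted into it and has the Docker SDK for Python installed (both of those are assumptions about your setup, not something you described):

```python
# Inside the calling container. Assumes the host started this container with
#   -v /var/run/docker.sock:/var/run/docker.sock
# and that the image contains the Docker SDK for Python (pip install docker).
import docker

client = docker.from_env()           # talks to the mounted engine socket

# Blocks until the sibling container exits, then returns its output.
output = client.containers.run(
    "gcc:12",                        # placeholder image
    ["bash", "-c", "gcc --version"], # placeholder command line
    remove=True)
print(output.decode())
```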

Hope it helps,
Javier R.

That is exactly what I want to achieve: Docker images of moderate size, aimed at well-defined tasks. My responsibility is to provide the images for my colleagues; I am not a primary user of them.

My colleagues insist that it would take far too much effort, or might even be impossible, to restructure their old (non-Dockerized) build jobs so that C compilation is done in one Docker image, log analysis in a Python image, and the documentation generated by a third image. They more or less demand that I make one huge, monolithic image. (For that task, multi-stage builds are a must, but that is not what I am after.)

The main problem is that the build & test process is managed by a Python management system that, based on supplied parameters and files, runs the C compiler on a subset of files, with variable compiler options, all based on selections made in Python code. The compilation log is analyzed in Python code, after which the management system runs tests on the generated executables, collects test logs, combines compile/test log extracts with documentation extracted from the source files, and starts report formatting using a couple of different formatters.

This Python management system is one huge, integrated system; it is not possible to break it up to alternate between Python, gcc, test manager, documentation… images. It expects to control the compilation of each module, to supply input to and capture output from each test, etc. In the non-Dockerized world this was fairly simple: compilers, systems under test and formatters ran as subprocesses with various executables - the compiler, test and documentation tools were independent of each other. With plain Docker, we have to merge all the tools into a single huge image to make it possible for the Python manager to activate them any time it chooses.

I want to tear that up. When the Python code runs a gcc process, I want that gcc process to be a stub that (in a way resembling an RPC call) starts the real gcc image in a separate container, and forwards stdin, stdout and stderr to it.
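A sketch of what such a stub could look like, assuming the calling container can reach the engine (for instance via a bind-mounted /var/run/docker.sock) and that the real compiler lives in a hypothetical gcc-toolchain image; the image tag and volume path below are made up for illustration:

```python
#!/usr/bin/env python3
# Hypothetical gcc stub: forwards the call to a sibling container and blocks.
import subprocess
import sys

IMAGE = "gcc-toolchain:12.2"        # placeholder image tag

# -i keeps stdin attached, so the Python manager can pipe data through the
# stub; stdout and stderr are inherited from this process, so they flow
# straight back to the caller.
cmd = ["docker", "run", "--rm", "-i",
       "-v", "/build:/build",       # assumption: sources shared via a volume
       "-w", "/build",
       IMAGE, "gcc", *sys.argv[1:]] # pass the original gcc arguments through

# Block until the compiler container exits and propagate its exit code.
sys.exit(subprocess.call(cmd))
```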

If I could, from inside the container, obtain the IP address of the Docker engine, then the stub could run a loopback: using RSH to do a “docker run -it gcc” (or whatever image). But I have not yet found any way to get hold of that IP address. We are running on build systems (Bamboo and Jenkins) that assign jobs to any node in a pool of build nodes, so the IP address of the Docker host cannot be predicted; it must be found dynamically. (127.0.0.1 does not do the trick!)
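A sketch of one possible direction (assuming the engine’s API were exposed on the host’s bridge address, which it is not by default): read the default gateway from /proc/net/route inside the container; on the default bridge network, that gateway is the host.

```python
# Find the container's default gateway by parsing /proc/net/route.
# On the default bridge network this is the Docker host's bridge address;
# it is only useful if the engine's API is actually listening there.
import socket
import struct

def default_gateway() -> str:
    with open("/proc/net/route") as fh:
        for line in fh.readlines()[1:]:
            fields = line.split()
            # Destination 00000000 with the RTF_GATEWAY flag (0x2) set
            # marks the default route.
            if fields[1] == "00000000" and int(fields[3], 16) & 2:
                return socket.inet_ntoa(struct.pack("<L", int(fields[2], 16)))
    raise RuntimeError("no default route found")

print(default_gateway())
```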

I am a little surprised that this is not a problem that has been raised hundreds of times before, leading to a selection of alternate solutions :slight_smile:

I am not sure why you want to use Docker in your context but maybe that can help: https://github.com/just-containers/s6-overlay

T.

Why do we want to use Docker?

The brief answer: for smooth switching between tool versions on build systems that handle a great number of different jobs.

We have one huge Bamboo build system, and one not quite as huge (yet) Jenkins build system. Both are shared by a large number of projects with very different requirements for tool versions and environment setups.

Until now, for Windows jobs we have used a wizard, called at the start of each job, to ensure that the proper tool versions are active. In most cases this can be done by moving symbolic links. In several (and an increasing number of) cases we must resort to uninstalling the unsuitable version and installing the version wanted by the current job. In a few cases it is not possible to switch between versions automatically, so we must install one version on one subset of nodes and the other version on another subset, and flag the nodes with the versions they have. (That sort of defeats the idea of having a large pool of shared build nodes, automatically adapting to needs.)

This wizard started as a simple and clean tool about seven years ago, but has grown into a messy, unstable monster as new tools have required yet another quirk to be handled. We have something like 20 different toolboxes active right now, and maintenance is a nightmare. The archive of retired toolboxes (which might have to be dug up for bugfixing in systems delivered to customers years ago) counts between 150 and 200: if debugging is required, we must rebuild a system exactly the way the customer’s system was built, to be sure it is bit-for-bit identical. So every single tool (or at least those affecting generated code) must be of exactly the same version.

One big problem is that some jobs do their own installation of unknown tool versions from the network. Some of these have “new and exciting” ways of uninstalling themselves. Or the installer breaks the symbolic links so they cannot be switched for the next job, because they are no longer links but plain files or directories. Another problem is that some of the tools cannot be installed/uninstalled in quiet mode but require manual intervention, and they do not lend themselves to symbolic link switching. And finally: this wizard is a Windows-only system; it cannot be adapted to Linux, which is gradually coming in for our new development projects.

If we replace those twenty toolboxes with twenty Docker images, jobs are not delayed by a lot of uninstalling and installing of tool versions. Tools that cannot work with symbolic links (they exist!) will work fine. Any job ruining its execution environment (e.g. by installing unknown software from the network) will cause problems for no one but itself. The images can (and in most instances will) be available in a Linux environment, and the image itself will usually be Linux based.

One nice side effect: our desktop PCs are still Windows based - but they can run Linux containers. So by pulling the tool image to the PC and supplying the build script to it, developers can run the build locally, exactly the same way the Jenkins job does it, before committing code changes.

We have had problems with Docker as well: our first images were preserved as Dockerfiles only. When we had to debug an old delivery and wanted to rebuild the image, some of the Linux tool versions were no longer available in the Linux apt software repository. Searching for something by the same name in other repos is not guaranteed to find something 100% identical to what we had before.

So now we are preserving images in binary form. We are also in the process of building a local repository of the packages we use in our environment (like the repository we have for the wizard handling Windows software), so that we can rebuild an image if someone by mistake erases the binary version. (Yes, we do make backups of the registry, but getting a mistakenly erased image back from the backup is such an involved business that we try to avoid it.)
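For reference, preserving an image in binary form can be as simple as a docker save / docker load round trip; here is a sketch using the Docker SDK for Python, where the image tag and archive path are placeholders:

```python
import docker

client = docker.from_env()

# Export a pinned image to a tar archive stored outside the registry.
image = client.images.get("gcc-toolchain:12.2")            # placeholder tag
with open("/archive/gcc-toolchain-12.2.tar", "wb") as f:   # placeholder path
    for chunk in image.save(named=True):                   # keep repo/tag metadata
        f.write(chunk)

# Restoring later (the equivalent of docker load):
with open("/archive/gcc-toolchain-12.2.tar", "rb") as f:
    client.images.load(f.read())
```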

From the maintenance point of view, Docker looks quite promising, and this attempt to split jobs up into separate tasks of a different nature (compiling, log analysis, test monitoring, document generation, …) is one way to keep that maintenance manageable. If we must put all the tools into one huge image, then every time someone asks for one tiny little update to any one of the dozens of tools in the complete set, we have to generate a whole new huge image. We will soon have not two hundred but two thousand retired tool sets, and each of them will be large…

One of our main projects has a tool set filling 6.9 GB, of which 5 GB are project-specific tools. Below those is a nicely layered structure of one base layer on top of another, all cleanly created in multi-stage builds so they contain as few build leftovers as possible. Then this project requests one extra Python package. The Python layer is among the “general” layers, so we have two alternatives. The first is to add a layer on top of those 6.9 GB, containing nothing but a “pip install”. If this happens all the time, in multiple projects, we will have a mess of Python configurations. It would be so much cleaner to make a new version of the Python image, where this package could be made available to all future images built on the Python base. If Python tasks were handled by an “external” image, using SSH as a relay to another container, no update to the giga-image would be required (but the task requesting Python would have to update the tag value to the new image).

The other alternative is to update the Python layer below those 5 “private” GB of tools, so that we get another 5 GB layer - nothing above that updated Python layer can be reused. That doesn’t make me very happy, either.

Worst of all: I am in an ongoing battle with one manager of that project. He wants a facility to rebuild his 6.9 GB image automatically: if a job specifies a new Python package version, he wants that to be detected automatically and a new Python layer + 5 GB “private” layer to be generated on the fly. He indicates that this might happen several times a day… We have no way of tracing which of these auto-generated images are no longer used; we must keep all of them indefinitely. They cannot be built locally on one Jenkins node and pruned locally: when the build is re-run, it might be placed on a different node, which will pull the image from the registry… (Note: this manager is a highly qualified software guy, not a technically ignorant administrator. Some excellent software developers are nevertheless completely ignorant of resource costs!)