Docker Community Forums


Basic container concepts: How to structure a CI/CD pipeline agent


I’m learning Docker by doing, so please forgive the basic questions.

If I were building a website, I get that each “task” should be in its own container, e.g. NGINX, Apache, a database, and I could write a Compose file to “group” them into a single application.

However, I have a task to create a GitHub Actions runner with Puppet’s PDK and client-tools as a container and I am struggling with some basic container concepts.

I am using my own Nexus private repo and have written my own Dockerfiles based on ones found around the Internet. I guess this, too, defeats the concept somewhat. I have done it to save waiting for containers to download each time I use them, and as part of my learning.

Container 1

I have built a GitHub Actions runner image using this guide. It works, in that I see it registered in GitHub and I can use it in my pipeline; however, it doesn’t have the required tools on it, namely PDK. It has some environment variables that allow it to register to my GitHub account.

Container 2

I have created another image using Puppet’s PDK Dockerfile as a reference, however, I have replaced their FROM value with the “path” to my Container 1 (GitHub runner) and installed client-tools.

If I use the PDK example as is, the entrypoint is set to the PDK executable, so when I start the container and look at the logs, I see usage help for the pdk command. I get why it’s doing this: I am not giving pdk any arguments. It feels as though this image is designed to be run once per command, and looking at some of the examples, I would need to run docker run -i -t puppetlabs/pdk:latest [some pdk command] each time I want to use it. This seems rather inefficient to me.

Given the above, should I:

  1. Write my own Dockerfile that includes all the tools I need.

  2. Use Container 1 (GitHub runner) as my base and use the GitHub Docker action to use the PDK container on the fly? I think this is known as nesting containers.

  3. Build Container 2 from Container 1, but change the entrypoint to a shell, perhaps use the same entrypoint as specified in Container 1?

  4. Write a Dockerfile that utilises the multi-stage builds feature?

  5. Use a different concept altogether?

Container 1 has some environment variables. Should I declare these when running Container 2?
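For what it’s worth, option 3 can be sketched as a Dockerfile that builds on the Container 1 image, installs PDK and client-tools, and keeps the runner’s entrypoint rather than pdk’s. The image name, package source URL, and user name below are all illustrative assumptions, not taken from the actual Dockerfiles:

```dockerfile
# Sketch of option 3 — build Container 2 FROM Container 1 and keep the
# runner entrypoint. The Nexus path and user name are hypothetical.
FROM my-nexus.example.com/github-runner:latest

USER root
# Install PDK and client-tools via Puppet's apt repo (release package
# URL is an assumption; match it to your base image's distro/codename).
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && curl -fsSL -o /tmp/puppet-tools.deb https://apt.puppet.com/puppet-tools-release-focal.deb \
 && dpkg -i /tmp/puppet-tools.deb \
 && apt-get update \
 && apt-get install -y pdk puppet-client-tools \
 && rm -rf /var/lib/apt/lists/* /tmp/puppet-tools.deb

# Deliberately NOT setting ENTRYPOINT to pdk: the container keeps
# Container 1's entrypoint, starts the Actions runner, and pdk is
# simply available on PATH for pipeline steps to call.
USER runner
```

With this shape, a pipeline step can just call `pdk validate` or `pdk test unit` inside the long-running runner, instead of spinning up a fresh `docker run … pdk` per command. On the environment-variable question: ENV defaults baked into Container 1’s Dockerfile are inherited by Container 2 automatically, but anything passed at `docker run -e …` time for Container 1 must be passed again when you run Container 2, since `-e` values are never stored in the image.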

Any help or pointers would be greatly appreciated.

T. I. A.

CI/CD is a term that is often heard alongside other terms like DevOps, Agile, Scrum, Kanban, automation, and others. Sometimes it’s treated as just part of the workflow, without really understanding what it is or why it was adopted. Taking CI/CD for granted is common among young DevOps engineers who might not have seen the “traditional” way of software release cycles and, hence, cannot appreciate CI/CD.

CI/CD stands for Continuous Integration/Continuous Delivery and/or Deployment. A team that does not implement CI/CD will have to pass through the following stages when it creates a new software product:

  1. The product manager (representing the client’s interests) specifies the features the product should have and the behaviour it should follow. The documentation must be as thorough and specific as possible.

  2. The developers, together with the business analysts, start working on the application: writing code, running unit tests, and committing the results to a version control system (for example, Git).

  3. Once the development phase is done, the project moves to QA. Several tests are run against the product — User Acceptance Tests, integration tests, and performance tests, among others. During this period there should be no changes to the code base until the QA phase is complete. If any bugs are found, they’re passed back to the developers, who fix them and hand the product back to QA.

  4. Once QA is done, the code is deployed to production by the operations team.
There are a number of shortcomings in the above workflow:

  1. It takes a long time from the moment the product manager makes her request until the product is ready for production.

  2. It’s harder for developers to address bugs in code that was written a long time ago — a month or more. Remember, bugs are only spotted after the development phase is over and the QA phase starts.

  3. When there’s an urgent code change, like a serious bug that needs a hotfix, the QA phase tends to be shortened due to the pressure to deploy as fast as possible.

  4. Since there’s little collaboration between the different teams, people start pointing fingers and blaming each other when bugs occur. Everybody cares only about their own part of the project and loses sight of the common goal.
CI/CD solves the above problems by introducing automation. Each change in the code, once pushed to the version control system, gets tested and then deployed to staging/UAT environments for further testing before being deployed to production for users to consume. Automation ensures that the entire process is fast, reliable, repeatable, and much less error-prone.
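Tying this back to the original question, the automated push-test-deploy flow can be sketched as a minimal GitHub Actions workflow. Job names, the pdk commands, and the placeholder deploy step are illustrative assumptions; `runs-on: self-hosted` targets a registered runner like the one described earlier:

```yaml
# .github/workflows/ci.yml — a minimal sketch, not a complete pipeline.
name: ci
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: self-hosted           # the self-hosted Actions runner container
    steps:
      - uses: actions/checkout@v4
      - run: pdk validate          # static checks on every push
      - run: pdk test unit         # unit tests on every push

  deploy:
    needs: test                    # only runs if the test job passes
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: echo "deploy to staging/UAT here"   # placeholder deploy step
```

Because every push triggers the same workflow, testing and deployment happen continuously and identically, which is exactly the repeatability that the manual handoff between development, QA, and operations lacked.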