Best practices for getting code into a container (git clone vs. copy vs. data container)

I would like to know how you get data (primarily source code) into your containers. From what I saw on different images there seem to be mainly three different approaches:

  1. Using RUN git clone ... in a Dockerfile and build the image each time the source code changes.
  2. Get the source code to the host and use COPY . /whatever in the Dockerfile.
  3. Get the source code to the host and use docker run -v $(pwd):/whatever/

Do any of these alternatives have severe drawbacks or problems? Some authors suggest that git clone in a Dockerfile does not always behave as expected (that is, it does not always fetch the latest version of the source code).

I would like to use the method for my development and build containers. So I would like a solution that works on my local machine, but also in combination with continuous integration platforms like Travis and others.

Thanks and best regards
Jan

I haven’t been using Docker for long and cannot really give credible advice but I do believe that there isn’t one size that fits all. It really depends on the use case.

In the only image I have published so far, only the 3rd option was feasible, because the user is expected to make (uncommitted) changes to the Git repository before starting the Docker container that uses the mounted repository.

If you don’t need to modify the code in the repository I’d go with 1 because it’s more convenient than 2. I don’t really see a use case for 2.

I have been working with Docker for 9 months.

In our case we use the COPY method for production images, because our goal is to deliver a working application without any other applications like Git. I want to run docker-compose and have everything there: my application container with all built assets and vendors, my server container and my database container.
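
As a rough sketch of what such a COPY-based production image can look like (the base image, the /app path and the run.sh start script are placeholders for illustration only, not taken from a real project):

```dockerfile
# Placeholder base image; any Debian or Alpine based image would do.
FROM debian:stable-slim

# Hypothetical application directory inside the image.
WORKDIR /app

# Copy the source that was already checked out on the host (or by CI).
# Git itself is never installed in the image.
COPY . /app

# Hypothetical start script; replace with the application's real command.
CMD ["./run.sh"]
```

A .dockerignore file that lists .git keeps the repository metadata out of the build context and therefore out of the image.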

On the other hand, we mount our project directories during development, because it's easier and much faster for us to work with the code that way.
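
For the development side, a comparable sketch: the image carries only the tooling and the project directory is bind-mounted at run time. The /app mount point and the image name myapp-dev in the comment are made up for the example.

```dockerfile
# Placeholder base image for a development container.
FROM debian:stable-slim

# The source is NOT copied in; /app only serves as the mount point.
WORKDIR /app

# Started with something like:
#   docker run --rm -it -v "$(pwd)":/app myapp-dev
# so edits made on the host show up in the container immediately.
CMD ["bash"]
```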

We never use git clone, because with it you cannot easily pull a specific commit in CI, and you are not interested in the latest commit anyway, because a new commit can appear while the CI build is running.
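
If cloning inside the Dockerfile is still wanted, one possible workaround is to pass the commit as a build argument. This is only a sketch: the repository URL, the /src path and the GIT_COMMIT argument name are invented for illustration.

```dockerfile
FROM alpine:3
RUN apk add --no-cache git

# Commit to build; supplied with --build-arg, so CI decides the exact revision.
ARG GIT_COMMIT

# The ARG is used in this RUN, so a different value invalidates the cache here.
# With an unchanged value Docker reuses the cached layer and does not clone again.
RUN git clone https://example.com/some/project.git /src \
 && git -C /src checkout "$GIT_COMMIT"
```

It would be built with something like docker build --build-arg GIT_COMMIT=<sha> . where the SHA is the commit CI was triggered for.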

Hi @zmijewski
I'm trying to start using Docker in my development process, all the way up to production.
Could you please elaborate on your development process? Mounting the project directory seems to be a great option, but I'm not sure exactly what to do. I want to have my dev environment (for example, a Tomcat server inside a container) with:

  1. Debugging enabled (maybe with remote debug?)
  2. CI

OK, I will try to explain it as best I can.

To use Docker in a development environment, we create Dockerfiles, usually based on Debian or Alpine distributions because they are the most lightweight. We install the required packages; in your case that would probably be Tomcat, Java and so on. For more info, check http://docs.docker.com/engine/articles/dockerfile_best-practices

Then we use docker-compose to run the containers with volumes pointing to our project directory. Read more here: https://docs.docker.com/compose/

I do not work with Java, so I have no clue how to debug Java applications. You are probably doing it through some kind of IDE like NetBeans, Eclipse or IntelliJ. Doing so with Docker is impossible unless you install Java on your host. In Ruby we debug with byebug, and we do it in the container without any effort at all.

As far as CI is concerned, there are a few approaches to running tests in that environment. You can use Jenkins to build a semi-production image from a Dockerfile, then run the tests, and finally build the full production image with the downloaded vendors and your whole application.

In a perfect world you should run one process per container to ease scaling. So for one simple web application you probably want one container with nginx, one with the web application and all vendors, one with the database, and probably one with persistent data storage for the database :smile:

I hope that helps you a bit.

Hi @zmijewski,

Thank you for your effort and time, and for such a detailed answer. I will try remote debugging first; what comes after that, I'm not sure yet.

I’m still new to Docker and learning it.

What if we don’t need to change anything in the server configuration or anything related to Docker, and have only changed the source code?

Should we still create a new Docker image with the source and deploy that image, or just deploy a new source code tag that contains only the source code changes?

If we deploy a Docker image every time, doesn’t that increase the size of the Docker container in production?

So does that mean your code lives in a docker hub as an image definition?

Hello @dgdosen ,

If you’re currently using Docker Hub, then yes, that is the case. Please remember that Docker Hub offers only one private repository for free. If you have multiple private projects, you’ll have to pay for Docker Hub. There are also alternatives to Docker Hub, none of which I’ve tried though.

Hey @kulinchoksi

Containers are ephemeral, meaning that the lifetime of a container is short. You should be able to destroy a container and easily run a new one that behaves exactly the same, with minimal configuration.

I think you should read up on how Docker uses “copy on write” to save space. In a nutshell, it means that if you run multiple containers from the same image, they all share the same image files. A file is only copied when a container changes it. So running 100 containers instead of 1 hardly increases the space used on your file system beyond what each container writes itself. It’s a really interesting concept.
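
Image layers are shared in a similar way, which is relevant to the question about shipping a new image for every code change. A sketch, assuming a hypothetical install-deps.sh script and an /app layout: the instructions that rarely change come first and the source is copied last.

```dockerfile
FROM debian:stable-slim
WORKDIR /app

# Rarely changing step: hypothetical script that installs the dependencies.
COPY install-deps.sh /app/
RUN sh ./install-deps.sh

# Frequently changing step: the application source.
# When only the source changes, the layers above are reused from the cache,
# so only this layer has to be rebuilt, pushed and pulled.
COPY . /app
```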

Hey @zmijewski and @winmintun,

I also use docker containers as jenkins slaves for building and testing code.

I’m curious though about what you mean when you say that for a web application you have a separate nginx and web application container? You mean that you use NGINX as a reverse proxy to your web application container? Because both should run a web server making the NGINX container optional depending on requirements.

Some clarification would be nice :slight_smile:

Use data volumes.
https://docs.docker.com/engine/tutorials/dockervolumes/

What if we want to pull code from a private repo and build it at docker build time?
To achieve this, I have made a few attempts:
1. Add the key with ADD /root/.ssh/id_rsa /root/.ssh/id_rsa
2. Use a shell script in the Dockerfile. The script, do-ssh.sh, is:

eval "$(ssh-agent)" && ssh-agent -s
chmod 0600 /root/.ssh/id_rsa
ssh-add /root/.ssh/id_rsa
git clone git@103.6.128.106:fw/product.git -b new

Then, RUN do-ssh.sh
id_rsa was added to the agent, but the clone still does not work.

I have already read this issue: https://github.com/docker/docker/issues/6396
It seems there isn’t any way to pull from a private repo successfully during docker build.
Must I compile my code at docker run time?
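
Newer Docker releases with BuildKit can forward the host's SSH agent into a single build step, so the key never has to be added to the image. A hedged sketch that reuses the repository from the attempt above; the alpine base image and the /src/product target directory are assumptions.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3
RUN apk add --no-cache git openssh-client

# Trust the Git server's host key so the clone does not stop at a prompt.
RUN mkdir -p -m 0700 ~/.ssh \
 && ssh-keyscan 103.6.128.106 >> ~/.ssh/known_hosts

# The agent socket is mounted only for this RUN; no key ends up in a layer.
RUN --mount=type=ssh \
    git clone -b new git@103.6.128.106:fw/product.git /src/product
```

The build is then started with docker build --ssh default . while an ssh-agent on the host holds the key.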

This is such a great question that I wrote a little post about it. In short, I download the repo via wget and copy the data into the Docker image.

Here are the details about how I do it: http://pascalandy.com/blog/best-practices-for-getting-code-into-a-container-git-clone-vs-copy-vs-data-container/

Cheers!
Pascal

@devmtl:
May I ask why you post a dead link?

Thanks for letting me know! A few weeks ago, I changed my blog URL from blog.pascalandy.com/ to pascalandy.com/blog/

I updated the link above :slight_smile:

Cheers!

devmtl,

I’m looking at your code at firepress.org…

I’m more than confused by what I’m looking at. I don’t recognize any of that code. What language is that? It doesn’t appear to be linux shell / bash commands… Is that a docker-compose file? What type of file is that, and how do you run it? Is this Java? I’m way confused…

The system talks about “Part two”. I’m sort of old fashioned. When I see part two, I immediately think there must be a part one somewhere. What is Part one, where is it, and why wasn’t it included?

No offense intended, but I’m totally lost. This may be an awesome technique, but without appropriate context the example is more than confusing.

Many thanks for any clarification you can offer,
Zip.

You are looking at a basic Dockerfile :slight_smile:

I used a lot of echo to debug stuff, so yes, it’s cleaner in my prod Dockerfile.

Part II in this case means: to add another theme, add the instruction there. Hope it helps :slight_smile:

Hi, Thanks for sharing the thoughts.
In my case, I have a container running my application. The application consumes a configuration file. The configuration file must be generic and may be updated based on requirements. So we keep the configuration in a Git repository, and any modification to it is pushed to that repository. My goal is to make the container consume the latest configuration available in Git (possibly after a restart of the container). I'm thinking of creating a Dockerfile with the following contents:

FROM ubuntu
# git and an SSH client are not part of the base image
RUN apt-get update && apt-get install -y git openssh-client
# For the private and public key
ADD .ssh /root/.ssh
# For copying the known_hosts file
RUN mkdir /usr/…
# Copy the known_hosts file
ADD known_hosts /usr/…
# WORKDIR creates the directory and, unlike RUN cd, stays in effect for later instructions
WORKDIR /opt/myservice/code_to_be_synced
# This is to clone the code base
RUN git clone username@gitrepo
ADD entry.sh /etc/
ENTRYPOINT /etc/entry.sh

entry.sh will have the lines of code to pull the latest changes from git repository (already cloned) and replace the existing configuration file with the latest available in Git repository.

Let me know if this is the suggested way.

Thanks,
Dinesh