So I’m new to Docker. I’ve figured out how to create containers and how to create images, but I’ve been reading that I should be using Dockerfiles. So far, though, I haven’t really been able to figure out why.
I get the theoretical advantage: you have a list of commands that you can review and edit at any time, making it very clear how the container has been created (as long as I’m correct that you actually use Dockerfiles to create containers), but in practice, I’m just not getting it.
Let’s use an example: I want to have a Docker container running Apache (httpd), nothing else, at least not that I’ve thought of so far. If I’m not mistaken, the Dockerfile for this container would contain just one line: FROM httpd:latest. I don’t quite see how that’s different from just typing docker run ... httpd:latest on the CLI. I’m guessing it’s not different?
So when should I use a Dockerfile? Is it only really useful for a setup where I’d want (for example) Apache, NodeJS and an MQTT broker in the same container? I guess there’s no pre-existing image that would create such a container, so a Dockerfile would then be a consistent way of creating that image in exactly the same way every time?
You need Dockerfiles when you want to build a new image yourself or modify an existing one.
If you just want a Redis or NodeJS container, you can use the official image and you are good to go. (But how were those images created? With Dockerfiles.)
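For the Apache example from the question, a Dockerfile only starts to pay off once you change something on top of the official image. A minimal sketch, assuming your site lives in a hypothetical ./public-html/ folder next to the Dockerfile (the htdocs path is the document root the official httpd image uses):

```dockerfile
# Start from the official Apache image; pinning a version instead of
# "latest" keeps rebuilds predictable.
FROM httpd:2.4

# Assumption: ./public-html/ is a made-up folder holding your static site.
# Copy it into the document root used by the official httpd image.
COPY ./public-html/ /usr/local/apache2/htdocs/
```

With only the FROM line, building this gives you nothing the official image doesn’t already provide, which is why the one-line version feels pointless.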
But what if you have developed your own Java project and want to dockerize it? There won’t be an official image for that, so you need to create one yourself.
This is where Dockerfiles come into play. With one, you can build a new image, and from that image you can create as many containers as you like.
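As a rough illustration of the Java case, something like the following could work. This is only a sketch: the jar name, the target/ path and port 8080 are assumptions for the example, not anything prescribed by Docker.

```dockerfile
# Base image that only provides a Java runtime (no build tools).
FROM eclipse-temurin:17-jre

# Working directory inside the image.
WORKDIR /app

# Assumption: the project has already been built on the host and produced
# target/app.jar (hypothetical name and path).
COPY target/app.jar app.jar

# Assumption: the application listens on port 8080. EXPOSE only documents
# the port; it does not publish it.
EXPOSE 8080

# Command that runs when a container is started from this image.
CMD ["java", "-jar", "app.jar"]
```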
Small side note on putting Apache, NodeJS and an MQTT broker in the same container:
I know this is just an example, but don’t ever do this. The correct way is to create three separate containers and let them communicate over a Docker network.
So if I understand correctly, the main idea of Dockerfiles is to use them when you have to do multiple steps to achieve your end goal, say downloading and installing something from GitHub and changing a couple of files. If you didn’t have a Dockerfile, each time you wanted to reach the situation provided by the Dockerfile, you would have to manually create an Ubuntu container, download and install from GitHub manually, and change the files manually. Simplified, one could say a Dockerfile is like an .exe installer for Windows: it does multiple things to give you a desired end result without having to do a lot of things manually. Is that roughly correct?
Why would running everything in one container be a bad idea? If the Docker container is made from a basic Ubuntu image, why wouldn’t it be able to run multiple services, like a normal Ubuntu install?
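Sort of. The workflow is as follows: one Dockerfile builds one image (1:1), and one image can start any number of containers (1:n).
To create a container you need an image to run in it, and those images are normally built using Dockerfiles.
For many problems there are already images available on hub.docker.com, but those images were built using Dockerfiles as well.
Without Dockerfiles there would be practically no images. (You can also snapshot a running container with docker commit, but then nobody can see or repeat how the image was made.)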
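To make the multi-step case from the question concrete (Ubuntu base, install something from GitHub, change a file), a Dockerfile could look roughly like this. The repository URL, the config file and run.sh are all made up for the example.

```dockerfile
# Start from a plain Ubuntu image.
FROM ubuntu:22.04

# Install git (plus CA certificates so HTTPS clones work), then clean up
# the package lists to keep the image smaller.
RUN apt-get update \
 && apt-get install -y --no-install-recommends git ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Assumption: hypothetical repository URL, used only for illustration.
RUN git clone --depth 1 https://github.com/example/project.git /opt/project

# Assumption: config/settings.conf is a hypothetical file next to the
# Dockerfile that replaces the project's default settings.
COPY config/settings.conf /opt/project/settings.conf

WORKDIR /opt/project

# Assumption: the project ships a run.sh start script (hypothetical).
CMD ["./run.sh"]
```

Building it once with docker build -t my-project . gives you the image (my-project is just a name picked here), and every docker run my-project after that starts a fresh container from it without repeating any of the steps by hand.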
This is definitely possible, but it’s an antipattern in Docker. In Docker, one container should have exactly one job. Docker really shines when using microservices instead of one big chunky server. So instead of having your DB, webserver, API and email server in one container, you split them into separate containers: one for the DB, one for the webserver, etc.
This way, if one container dies, the other containers stay alive. Compare that to the traditional big chunky server which hosts all of your services: if that server goes down, your whole infrastructure is gone instead of just one part of it.
That makes sense, but is it always worth it?
Say I have a basic website (personal use only), hosted on Apache, with NodeJS for the backend. If Apache goes down, NodeJS can’t do anything, and if NodeJS goes down, all Apache could do is show that it’s not working. So there is some benefit in splitting them up, but for my use, it’s not massive. Therefore, I’m interested in the performance implications. How much of a difference would it make to run Apache and NodeJS together on, say, Alpine, compared to running them both separately? If running them separately takes almost twice as many resources (which, to be clear, I’m not expecting, but I’m not sure), it doesn’t make sense for me to split them up, since I’m barely gaining anything by doing it.
If you are just tinkering with docker for your little home setup you can basically abuse docker and commit every sin possible. It doesn’t really matter.
These docker best practices only really come in handy if you use docker on a large scale.
One advantage of the microservice infrastructure is that you can use the official images for nodejs and apache and don’t need to create a big image yourself.
But do what you prefer.