Confused about Docker: images reverting

I’m confused about Docker to be honest. I have read a lot of the documentation but I don’t understand what’s going on with my image.

I pulled an Ubuntu image to my Mac, and executed docker run -i -t on it. Hooray! I’m in my image.

So I installed nodejs, npm, and the latest version of Python. Great!

But then — I leave the image and go back to the host terminal. I docker run it again, and find that all the changes I made have disappeared.

This reveals a fundamental ignorance of the nature and functions of Docker. Can I not make changes to my image and save it?

Not quite. You’re in a container that’s merely based on the image. And as you’ve noticed, every docker run starts a new container from the unchanged image, so nothing you did in the previous container carries over. (And if you start a second copy of the container at the same time, their two filesystems are separate.)

You should expect this to happen routinely and plan for it.

Best practice, by far, is to learn about the docker build command. It builds an image from a Dockerfile (essentially a glorified shell script) that describes each step of the setup. Then you can check that file into source control. If your container needs to do any setup when it starts, you can bundle a shell script into the image.
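As a rough sketch of what that could look like for the setup in the question: the directory name, the image tag my-image, the ubuntu:22.04 base, and the package list are all assumptions for illustration, and the python3 package from apt is whatever the distribution ships, not necessarily the latest Python.

```shell
# Sketch only: the directory, image tag, and package list below are assumptions.
mkdir -p my-image

# The Dockerfile records the setup steps that were previously typed by hand.
cat > my-image/Dockerfile <<'EOF'
FROM ubuntu:22.04
# Install Node.js, npm, and Python in one layer, then drop the apt cache.
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      nodejs npm python3 python3-pip \
 && rm -rf /var/lib/apt/lists/*
# Default command when a container starts from this image.
CMD ["bash"]
EOF

# Build the image, then start a throwaway interactive container from it.
build_and_run() {
  docker build -t my-image my-image
  docker run --rm -i -t my-image
}
```

Calling build_and_run on a machine with Docker installed should drop you into a shell where nodejs, npm, and python3 are already present, and every new container started from my-image begins in that same state.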

If you have a long-running container with persistent state (like a database) then you can use Docker volumes to hold the data across container executions, or use the docker run -v option to store that data in a host-system directory.
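For illustration, here is a minimal sketch of both options using the official postgres image as a stand-in database. The volume name, container names, password, and host directory are hypothetical; the data path is the one the postgres:16 image uses.

```shell
# Sketch only: names, password, and host path are assumptions.

# Option 1: a named Docker volume holds the database files.
start_db_named_volume() {
  docker volume create pgdata
  docker run -d --name db \
    -e POSTGRES_PASSWORD=example \
    -v pgdata:/var/lib/postgresql/data \
    postgres:16
}

# Option 2: a host directory is mounted into the container instead.
start_db_host_directory() {
  mkdir -p "$PWD/pgdata"
  docker run -d --name db-hostdir \
    -e POSTGRES_PASSWORD=example \
    -v "$PWD/pgdata:/var/lib/postgresql/data" \
    postgres:16
}
```

In either case the container can be removed and recreated, and the data remains in the volume or the host directory.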

Now when you need to move this setup to another system, or something goes wrong and you need to rm -rf /var/lib/docker, or worse, all you need to do to recreate the setup is to clone your source repository, docker build the image again, and go.

To reiterate: things go wrong, and there are some Docker options that can only be set when you initially docker run a container. Plan up front for the container to be restarted from a clean filesystem.

(Oh yeah, there’s docker commit too, but there are things like the default CMD that are hard to set that way, and this tends to lead to “magic images” that can’t be recreated if some part of the system fails.)

Oh my word. Thank you for your comprehensive explanation. I’m learning around half a dozen new technologies at the moment, and I thought Docker was going to be the most straightforward!

So, let me check my understanding. I pull an Ubuntu image from a public repo, and running docker run on the image creates a container based on the image. (Is the container merely a copy of the image?) Making changes to the container is as wasteful as decorating a house that is slowly falling into the sea. The changes will vanish when you exit the container.

So, docker build is the way forward. Gotcha. I’m on it.

Thanks! M

All correct.

(At a technical level, the container is a copy-on-write layer over the image, which actually means that starting it is fast [the system doesn’t really create a complete copy of it] and that files you don’t modify don’t take up additional space. But that’s an optimization.)
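If you want to see that layer for yourself, docker diff lists what a container has changed relative to its image. A small sketch, assuming the ubuntu:22.04 image and a hypothetical container name cow-demo:

```shell
# Sketch only: the container name and image tag are assumptions.
show_container_changes() {
  # Create one file in a fresh container, then let the container exit.
  docker run --name cow-demo ubuntu:22.04 touch /hello
  # List the files added, changed, or deleted relative to the image.
  docker diff cow-demo
  # Remove the stopped container and its writable layer.
  docker rm cow-demo
}
```

Only the files the container touched show up in the listing; everything else is still read straight from the image.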

Great, yes I have now successfully understood and applied the eloquent and perspicuous presentation you gave.

In a word: thanks.

Docker is all working great for me; now I have to try and figure out why Python is unhappy with boto3!

M