Understanding Docker as a Backup/Restore Solution

Hello, I just got started with Docker and I am planning to base my IT infrastructure on it (1 developer, a few customers). I have set up Docker on my Ubuntu server and created a test container, and I love it: deployment was so easy. I stopped my container and restarted it, and all the data was still there, so I am happy. What I do not understand about Docker is the hardware/OS abstraction, so please correct me if I am wrong:
  • Docker does not abstract any hardware like a virtual machine does.
  • Docker runs ON the guest OS and uses its features.
  • I can back up and restore a container on the SAME machine.
  • I CANNOT restore a container on a DIFFERENT Docker machine, because the hardware and the OS under it are different.

My goal for the new infrastructure is to make backup and restore much easier, since I do not want to learn how every single solution has to be backed up and restored. I have already lost all my systems because of this laziness, so I want to do better this time…

So here's the basic idea of the new setup:

  • I run a VM on my current server in order to abstract the hardware, so I don't get huge problems if my current server hardware R.I.P.s.
  • The VM runs a ???-Linux distribution with Docker installed. I keep a backup of the [Initial Zero-Docker-Images VM].
  • All services are installed as Docker containers.
  • Every day, a cron job on the host VM stops every container, makes a backup, and restarts it (no need for 24/7); a rough sketch of what I have in mind is below this list.
  • This container backup can be restored on any copy of my VM backup of [Initial Zero-Docker-Images VM], on any hardware that is able to run the VM.
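
Roughly, I imagine the nightly backup job doing something like this (just a sketch; the container name and the backup path are made up, and I am not sure yet whether this is the right way):

    #!/bin/sh
    # stop the container, snapshot it as an image, save the image to a file, restart
    docker stop gitbucket
    docker commit gitbucket gitbucket:nightly
    docker save -o /backups/gitbucket-nightly.tar gitbucket:nightly
    docker start gitbucket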

Thanks for reading, Tom

My understanding of best practices works like this:

  1. You or your developers write Dockerfiles, specialized scripts to create Docker images. These are checked into source control.
  2. You or a CI system builds images from the Dockerfiles, which are stored in a registry (either Docker Hub or a private registry you control). If you lose these images, you can always rebuild them from the Dockerfiles.
  3. Your local system has a copy of the images. If you lose them, you can always “docker pull” them.
  4. You have local data volumes, which actually hold your critical data. This needs to be managed.
  5. Finally, containers run some image with some set of data volumes, and also have some local ephemeral state. If one of these fails in some way, you can just recreate it. (A rough command-level sketch of this flow follows below.)
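
As a very rough sketch of that flow, assuming a hypothetical image called myapp, a registry you control at registry.example.com, and a made-up data path:

    # 1-2. Build the image from its Dockerfile and push it to your registry
    docker build -t registry.example.com/myapp:1.0 .
    docker push registry.example.com/myapp:1.0

    # 3. On any host that needs it, pull the image back down
    docker pull registry.example.com/myapp:1.0

    # 4-5. Keep the critical data in a named volume and run a container on top of it
    docker volume create myapp-data
    docker run -d --name myapp -v myapp-data:/var/lib/myapp registry.example.com/myapp:1.0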

So in this scheme, the only things it’s critical to back up are the Dockerfiles (and related deployment artifacts, like your Docker Compose YAML file and what not), and the contents of any data volumes you need. For some things I’ve built, I’ve found it easier to rely on an external non-container database, and just have no local data at all.
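
For the data volume contents, one common pattern (not the only one) is to archive the volume with a short-lived helper container; here the volume name myapp-data and the paths are hypothetical:

    # back up the contents of the myapp-data volume into the current directory
    docker run --rm -v myapp-data:/data -v "$PWD":/backup alpine \
        tar czf /backup/myapp-data.tar.gz -C /data .

    # restore the archive into a (possibly freshly created) volume on another host
    docker run --rm -v myapp-data:/data -v "$PWD":/backup alpine \
        tar xzf /backup/myapp-data.tar.gz -C /data

The helper container is thrown away afterwards; only the archive file needs to go wherever your other backups go.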

Docker uses the host system’s kernel. There is in fact almost no hardware abstraction. The userspace (if any) is entirely contained within the container image, and there are prebuilt base images for most popular Linux distributions.
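
You can see the kernel sharing directly: containers built from different distributions all report the host's kernel version (the image names here are just examples):

    uname -r                              # kernel version on the host
    docker run --rm ubuntu uname -r       # same version inside an Ubuntu-based container
    docker run --rm alpine uname -r       # same version inside an Alpine-based container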

In principle, you could take some sort of backup of a local system’s Docker space; better setups put it on a dedicated LVM partition and you should not expect to be able to look “inside” Docker’s state directory. You can quite usefully “docker save” and “docker load” images across systems and Linux distributions. I imagine you could do the same thing with docker export/import for a container’s current content if you really wanted to.
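
For example (image and container names are made up):

    # move an image between hosts without going through a registry
    docker save -o myimage.tar myimage:latest     # on the source machine
    docker load -i myimage.tar                    # on the target machine

    # snapshot a container's current filesystem as a new image
    # (note that data living in volumes is not included in the export)
    docker export -o mycontainer.tar mycontainer
    docker import mycontainer.tar myimage:snapshot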

It sounds like what you’re describing with your proposed backup/restore solution is a setup where an administrator could make untracked changes to a running container, and your system would capture that as an opaque thing. Actually recording those changes in the source control history of a Dockerfile will probably make them much easier to manage. You can test a change easily in a one-off environment: edit the Dockerfile, build the updated image, use Docker Compose to start a local set of containers replicating your environment, and make sure everything works; then commit it to source control and roll it out to production.
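
That test loop might look roughly like this (hypothetical image name, and assuming a docker-compose.yml describing your environment):

    # edit the Dockerfile, then rebuild and bring up a throwaway copy of the stack
    docker build -t myapp:test .
    docker-compose up -d
    # ...check that everything still works, then tear it down...
    docker-compose down

    # record the change in source control and roll it out
    git add Dockerfile
    git commit -m "Describe the change"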


Hello, thank you for the answer. I am still learning Docker.

Maybe I should clarify what I want to do:
I am a freelance developer, and last time I got stuck setting up all the servers I need, like source control, project management tools and so on. I ran into problems because every solution needs another Tomcat server, Apache server, Java runtime, Ruby runtime, Python runtime and so on. I hosted everything on two virtual machines, because the first one ran out of resources.

After having everything set up, I did not take care of backing up the virtual machines, because I could not find a way to do this (external VM provider). I also did not back up every single solution within the VMs, because that was not easy: most solutions have different storage places, e.g. documents are stored here in the filesystem, some data is stored in this database and some data in those files. I just thought: they have good hardware, that won't break, and then an employee of the VM provider company deleted my VMs by mistake…

Now I have just started with Docker and did a test setup of a GitBucket server, and it was very easy, so I am happy with that. But I am really confused about the concepts of Docker, especially the best practice you told me about:

  • You have local data volumes, which actually hold your critical data. This needs to be managed.

That's actually the task I want to get rid of.
I just want to install from Docker, run the container, and have the container store the binaries, the configs and the data.
I make a backup of this container.
In case of emergency, I restore it on another host system, including all configs, all data and all binaries.
No need for special insight into the complexity of the hosted service itself. It could rely on 10 different databases and 517 locations in the filesystem where data is stored. I don't care; I simply back up and restore the whole infrastructure.
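
In my head, the restore on a fresh host would then look roughly like this (again just a sketch; the file name, container name and port are made up and depend on the service):

    # copy the saved backup archive to the new host, then:
    docker load -i /backups/gitbucket-nightly.tar
    docker run -d --name gitbucket -p 8080:8080 gitbucket:nightly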