I have a large Postgres database that I'm building into a docker container.
Right now, I'm downloading a compressed database dump and restoring it during the docker build process. So it takes a 1.7 GB compressed file and generates a docker image of about 40 GB.
The downside of this, of course, is that deploying the docker image or pulling it from DTR is super slow.
I was wondering if anyone had any suggestions on how to better manage the volume data.
This is my current Dockerfile:
FROM internal_db:base

COPY *.Fc /tmp/sql/
COPY demoinit.sh /tmp/sql/

RUN /usr/pgsql-10/bin/pg_ctl -D /postgresql/pg_data/10/ start && \
    /tmp/sql/demoinit.sh /tmp/sql && \
    /usr/pgsql-10/bin/pg_ctl -D /postgresql/pg_data/10/ stop

USER root
RUN /usr/bin/rm -Rf /tmp/sql
USER postgres
The script it calls does a few sanity checks and then ends up running this line, which does most of the work:
pg_restore -p 5432 --dbname=postgres -Fc --create --verbose --jobs=4 --no-tablespaces /tmp/sql/akdb-extract-for-docker.Fc
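For context, demoinit.sh is roughly shaped like this. This is a simplified sketch, not the real script: the argument handling and the exact sanity checks (dump file present, server accepting connections via pg_isready) are stand-ins for what it actually does.

#!/bin/bash
# Simplified sketch of demoinit.sh -- the real script has more checks.
set -e
SQL_DIR="${1:?usage: demoinit.sh <dir containing the .Fc dump>}"

# Sanity checks: dump file exists and the local server is up
[ -f "$SQL_DIR"/akdb-extract-for-docker.Fc ] || { echo "dump not found in $SQL_DIR"; exit 1; }
/usr/pgsql-10/bin/pg_isready -p 5432 || exit 1

# The restore itself does most of the work
pg_restore -p 5432 --dbname=postgres -Fc --create --verbose --jobs=4 \
    --no-tablespaces "$SQL_DIR"/akdb-extract-for-docker.Fc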
On the one hand, I like the fact that the data is in the image because I can take advantage of an image reset to restore the DB data to its original state, but at the same time having an image that large seems like an anti-pattern.
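(For reference, the "reset" is just throwing away the running container and recreating it from the image, roughly like below; the container name and image tag here are placeholders, not our real ones.)

# Discard current state and start fresh from the data baked into the image
docker rm -f demo-db
docker run -d --name demo-db internal_db:demo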
Does anyone have a better suggestion on how to do this? (This has also caused serious issues in the past, with docker VMs running out of space/memory and so on while building the image.)