Docker database disappears randomly

Migrated an application to Docker. It consists mainly of a webapp container and a postgres container, the latter using the official image postgres:17.6. Snippet from compose.yaml:

  dbservice:
    container_name: pg
    image: postgres:17.6
    restart: always
    ...
    volumes:
      - 'pgdata:/var/lib/postgresql/data'
      - './db-init/10-init.sql:/docker-entrypoint-initdb.d/10-init.sql'
volumes:
  pgdata:

Worked great in testing, so I deployed to production yesterday. Initially it worked, and I could bring the system up and down with data being persisted.

Then, after an hour, the data written to the database disappeared and the webapp started logging that it can’t connect to the database instance (not the database service as such, which is still running). I “docker exec -it pg /bin/bash” into the container and checked with “psql -U postgres”, and indeed only the default database was there; everything else created by the init script and at runtime was gone.

So I ran “docker compose down”, removed the volume, then ran “docker compose up -d”. The volume was recreated, the database instance was recreated by the 10-init.sql script, and everything was fine.

This morning, same problem in production, and it had been down for a long time. No clue in the logs as to why it happened. So now I’m scared sh**less because I realize I put a new app architecture in production while being a total noob at Docker.

The official postgres docs have this comment:

Important Note: (for PostgreSQL 17 and below) Mount the data volume at /var/lib/postgresql/data and not at /var/lib/postgresql because mounts at the latter path WILL NOT PERSIST database data when the container is re-created. The Dockerfile that builds the image declares a volume at /var/lib/postgresql/data and if no data volume is mounted at that path then the container runtime will automatically create an anonymous volume that is not reused across container re-creations. Data will be written to the anonymous volume rather than your intended data volume and won’t persist when the container is deleted and re-created.

Given the use of postgres 17, I would think the compose.yaml is in compliance by using /var/lib/postgresql/data?

When I look inside the Dockerfile of the postgres 17.6 image, it also declares the volume like this:

VOLUME /var/lib/postgresql/data

Also, the data is lost before the container is recreated. Am I doing something obviously wrong?

UPDATE:

Checking “docker volume ls”

DRIVER VOLUME NAME
local 78b720ec7d7b1ab97b15588606ded350d3afd88a0bc221b30ec2c09c0c921756
local 79c1d1f219e46426b3cb1c898646815d8681614939cf22844678888eb31f61cf
local myapp_pgdata

One anonymous vol could be removed with “docker volume rm ..”, the other is in use by another container. The correctly named “myapp_pgdata” has lots of data as expected.

The first two are indeed anonymous volumes. They are created if the Dockerfile declares a path as VOLUME, but neither a volume nor a bind mount is mapped to that container folder.

As long as your compose project has the same name (either because you specified it, or the folder name where the compose file is located is used as fallback), the container should be able to see and use the right volume.
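To confirm which volume the container really uses, something like the following might help. It is only a sketch: the project name `myapp` is inferred from the `myapp_pgdata` entry in the volume listing, the container name `pg` comes from the compose snippet, and both helper function names are made up for illustration.

```shell
#!/bin/sh
# Compose names a volume <project>_<volume> unless the volume sets an explicit name.
expected_volume() {
  printf '%s_%s\n' "$1" "$2"
}

# Print the name of the volume mounted at path $2 inside container $1.
# Requires the Docker CLI; nothing is called until the function is used.
mounted_volume() {
  docker inspect --format \
    "{{range .Mounts}}{{if eq .Destination \"$2\"}}{{.Name}}{{end}}{{end}}" "$1"
}
```

Usage on the host: compare `mounted_volume pg /var/lib/postgresql/data` with `expected_volume myapp pgdata`. If the first prints a long hex string instead of `myapp_pgdata`, the container is writing to an anonymous volume.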

Did you happen to upgrade the tag from 17.x to 17.6 when the problem occurred? Postgres requires a data export/import when a major version upgrade is done (minor updates within 17.x do not need one).

The cause might be Ubuntu’s unattended upgrades. I tried disabling them yesterday, and production has been running without issue since. But if that is the case, then I’d still like to understand why this type of shutdown is such a severe failure mode for postgres in Docker. The same postgres database has been running on Ubuntu with unattended upgrades but without Docker for more than a decade with 100% reliability.

I also disabled APT daily upgrade just in case.

systemctl stop unattended-upgrades 
apt-get purge unattended-upgrades 

systemctl stop apt-daily-upgrade.timer
systemctl disable apt-daily-upgrade.timer
systemctl daemon-reload

Also, I should mention that it appears the actual pgdata volume is not damaged by the shutdown. But when I attempt to restart the Docker container, it spews out hundreds of errors relating to not finding stuff in the image. Here’s a small sample:

2025-08-28 11:33:36.350 UTC [1] LOG: starting PostgreSQL 17.6 (Debian 17.6-1.pgdg13+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit
| 2025-08-28 11:33:36.351 UTC [1] LOG: listening on IPv4 address “0.0.0.0”, port 5432
| 2025-08-28 11:33:36.351 UTC [1] LOG: listening on IPv6 address “::”, port 5432
| 2025-08-28 11:33:36.353 UTC [1] LOG: listening on Unix socket “/var/run/postgresql/.s.PGSQL.5432”
| 2025-08-28 11:33:36.360 UTC [28] LOG: database system was shut down at 2025-08-28 11:33:19 UTC
| 2025-08-28 11:33:36.370 UTC [1] LOG: database system is ready to accept connections
| 2025-08-28 11:33:44.403 UTC [65] FATAL: unsupported frontend protocol 16.0: server supports 3.0 to 3.0
| 2025-08-28 11:38:36.451 UTC [26] LOG: checkpoint starting: time
| 2025-08-28 11:38:36.460 UTC [26] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 s, total=0.010 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB; lsn=0/1A07AB0, redo lsn=0/1A07A58
| bash: line 5: chattr: command not found
| bash: line 7: chattr: command not found
| bash: line 8: chattr: command not found
| bash: line 9: ufw: command not found
| bash: line 10: iptables: command not found
| bash: line 11: /proc/sys/kernel/nmi_watchdog: Read-only file system
| bash: line 12: /etc/sysctl.conf: Permission denied
| bash: line 176: curl: command not found
| bash: line 176: /usr/local/bin/curl: Permission denied
| chmod: cannot access ‘/usr/local/bin/curl’: No such file or directory
| bash: line 177: /usr/local/bin/curl: No such file or directory
| bash: line 178: /usr/local/bin/curl: No such file or directory

The compose file snippet you shared, does it really show all volumes and binds defined for the dbservice?

Furthermore, does your dbservice declare overrides for entrypoint and/or command?

It almost looks like a script is running in the container that is not meant to be run inside a container.
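A rough way to pull those lines out of the full log is a filter like the one below. This is only a sketch: the function name is hypothetical, and the list of tool names is taken from the log sample in this thread, so it will not catch everything.

```shell
#!/bin/sh
# Print log lines where bash tried to touch host-administration tools or
# kernel settings, which the postgres image has no reason to do on its own.
# Reads log text on stdin.
suspicious_lines() {
  grep -E 'bash: line [0-9]+: .*(chattr|ufw|iptables|curl|sysctl|nmi_watchdog)'
}
```

Usage on the host: `docker logs pg 2>&1 | suspicious_lines`. Any output means something inside the container ran a shell script that was not part of the image's normal startup.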

The real compose.yaml has the sql initialization split in 2 sql files bound in the same way shown in the snippet. The first creates a new postgres user, the second creates the database, schema name, ddl and data, all set to the new user. There are no “sh” scripts bound to init.

./db-init/10-init.sql:/docker-entrypoint-initdb.d/10-init.sql
./db-init/20-init.sql:/docker-entrypoint-initdb.d/20-init.sql

It doesn’t try to run these scripts when restarting after the shutdown because it finds the existing pgdata volume is already created.

The service declares neither an entrypoint nor a command.

Because the log shows some alteration of the postgres role, I removed the exposure of port 5432 to the host. The host has ufw running, which does not allow outside access to the port. I’m a little suspicious of Docker’s own iptables rules, so just in case I commented this out and also changed the postgres account password:

 #ports:
 #- 5432:5432

The container shares a Docker network with the webapp container, so the exposure to the host is unneeded.

UPDATE: So the script angle is getting interesting. I found this in the postgres container log, which is malware:

/tmp/kinsing
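To see what else was dropped into the container's filesystem, the output of “docker diff” can be filtered roughly like this. It is a sketch only: the function name is hypothetical, and the directory list is my assumption about where a stock postgres container should not gain new files.

```shell
#!/bin/sh
# "docker diff" prints one change per line: A (added), C (changed), D (deleted).
# Keep only files ADDED under directories that are worth a closer look.
# Reads "docker diff" output on stdin.
added_in_odd_places() {
  grep -E '^A /(tmp|var/tmp|usr/local/bin)/'
}
```

Usage on the host: `docker diff pg | added_in_odd_places`. Note that “docker diff” only covers the container's writable layer, so anything written into the pgdata volume does not show up here.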

I have to look into whether Docker’s own iptables rules are circumventing my ufw rules.
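For what it's worth, the Docker documentation does warn that published container ports are handled by rules Docker adds to iptables itself, so they bypass ufw. A quick way to list containers whose ports are published on every host interface might look like this (a sketch; the function name is hypothetical):

```shell
#!/bin/sh
# Print lines whose port mapping listens on all host interfaces:
# "0.0.0.0:" for IPv4, ":::" or "[::]:" for IPv6 depending on Docker version.
# Reads "<name> <ports>" lines on stdin.
published_everywhere() {
  grep -E '(0\.0\.0\.0|\[::\]|::):[0-9]+->'
}
```

Usage on the host: `docker ps --format '{{.Names}} {{.Ports}}' | published_everywhere`. If a port really has to be reachable from the host, publishing it as `127.0.0.1:5432:5432` in compose.yaml binds it to loopback only instead of all interfaces.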