What steps will reproduce the bug?
I have 3 physical PCs. Configuration details (per machine):
- RAM - 32 GB
- Hard disk - 2 TB
- CPU - 8 cores
- OS - Ubuntu Server 20.04 LTS
Docker swarm setup details (the compose file below pins services to these nodes via node labels; see the labelling sketch after this list):
- PC1 - Manager node
- PC2 - Worker node (primary database)
- PC3 - Worker node (secondary database)
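For reference, a minimal sketch of how the worker nodes are labelled so that the node.labels.type placement constraints in the compose file resolve; pc2 and pc3 are placeholders for the actual node names reported by docker node ls:

# run on the manager node (PC1); hostnames below are placeholders
docker node ls                                     # list node names
docker node update --label-add type=node1 pc2      # pg-0 (primary) is pinned here
docker node update --label-add type=node2 pc3      # pg-1 (standby) is pinned here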
My problem is with PostgreSQL replication and failover in the Docker Swarm setup. If the primary node goes down, the secondary node takes over as primary. When the original primary node comes back online, the existing data in its Docker volume is reinitialized, i.e. the existing data is removed and everything is synced again from the new primary. The compose file I deploy as a swarm stack:
version: '3.8'
services:
  pg-0:
    image: bitnami/postgresql-repmgr:14
    ports:
      - 5432
    volumes:
      - pg_0_data:/bitnami/postgresql
    environment:
      - POSTGRESQL_POSTGRES_PASSWORD=adminpassword
      - POSTGRESQL_USERNAME=customuser
      - POSTGRESQL_PASSWORD=custompassword
      - POSTGRESQL_DATABASE=customdatabase
      - POSTGRESQL_NUM_SYNCHRONOUS_REPLICAS=1
      - REPMGR_PRIMARY_HOST=pg-0
      - REPMGR_PARTNER_NODES=pg-1,pg-0
      - REPMGR_NODE_NAME=pg-0
      - REPMGR_NODE_NETWORK_NAME=pg-0
      - REPMGR_USERNAME=repmgr
      - REPMGR_PASSWORD=repmgrpassword
      - REPMGR_CONNECT_TIMEOUT=2
      - REPMGR_RECONNECT_ATTEMPTS=1
      - REPMGR_RECONNECT_INTERVAL=2
      - REPMGR_MASTER_RESPONSE_TIMEOUT=5
    deploy:
      placement:
        constraints: [node.labels.type == node1]
  pg-1:
    image: bitnami/postgresql-repmgr:14
    ports:
      - 5432
    volumes:
      - pg_1_data:/bitnami/postgresql
    environment:
      - POSTGRESQL_POSTGRES_PASSWORD=adminpassword
      - POSTGRESQL_USERNAME=customuser
      - POSTGRESQL_PASSWORD=custompassword
      - POSTGRESQL_DATABASE=customdatabase
      - POSTGRESQL_NUM_SYNCHRONOUS_REPLICAS=1
      - REPMGR_PRIMARY_HOST=pg-0
      - REPMGR_PARTNER_NODES=pg-0,pg-1
      - REPMGR_NODE_NAME=pg-1
      - REPMGR_NODE_NETWORK_NAME=pg-1
      - REPMGR_USERNAME=repmgr
      - REPMGR_PASSWORD=repmgrpassword
      - REPMGR_CONNECT_TIMEOUT=2
      - REPMGR_RECONNECT_ATTEMPTS=1
      - REPMGR_RECONNECT_INTERVAL=2
      - REPMGR_MASTER_RESPONSE_TIMEOUT=5
    deploy:
      placement:
        constraints: [node.labels.type == node2]
  pgpool:
    image: bitnami/pgpool:4
    ports:
      - 5000:5432
    environment:
      - PGPOOL_BACKEND_NODES=0:pg-0:5432,1:pg-1:5432
      - PGPOOL_SR_CHECK_USER=repmgr
      - PGPOOL_SR_CHECK_PASSWORD=repmgrpassword
      - PGPOOL_SR_CHECK_PERIOD=5
      - PGPOOL_HEALTH_CHECK_PERIOD=3
      - PGPOOL_HEALTH_CHECK_RETRY_DELAY=2
      - PGPOOL_HEALTH_CHECK_MAX_RETRIES=1
      - PGPOOL_HEALTH_CHECK_TIMEOUT=2
      - PGPOOL_MAX_POOL=32
      - PGPOOL_ENABLE_LDAP=no
      - PGPOOL_POSTGRES_USERNAME=postgres
      - PGPOOL_POSTGRES_PASSWORD=adminpassword
      - PGPOOL_ADMIN_USERNAME=admin
      - PGPOOL_ADMIN_PASSWORD=adminpassword
      - PGPOOL_ENABLE_LOAD_BALANCING=yes
      - PGPOOL_POSTGRES_CUSTOM_USERS=customuser
      - PGPOOL_POSTGRES_CUSTOM_PASSWORDS=custompassword
    healthcheck:
      test: ["CMD", "/opt/bitnami/scripts/pgpool/healthcheck.sh"]
      interval: 5s
      timeout: 5s
      retries: 2
    deploy:
      placement:
        constraints: [node.role == manager]
volumes:
  pg_0_data:
    driver: local
  pg_1_data:
    driver: local
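For completeness, this is roughly how I deploy the stack and check which node is currently primary; the stack name pg and the container IDs are placeholders:

# on the manager node; "pg" is an example stack name
docker stack deploy -c docker-compose.yml pg

# on the node running a replica, check whether it is primary or standby
# (pg_is_in_recovery() returns f on the primary and t on a standby)
docker exec -it <pg-0-container-id> \
  env PGPASSWORD=adminpassword psql -U postgres -h 127.0.0.1 -c "SELECT pg_is_in_recovery();"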
What is the expected behavior?
I run the above compose file on Docker Swarm, and one primary and one secondary database are running. When the primary database goes down, the secondary becomes the new primary. When the failed node comes back up, I expect it to rejoin as a secondary and resync only the data it missed from the new primary. What actually happens is that its data is removed automatically and the whole database is copied again from the primary from scratch, not just the changed data. In this case only the changed data should need to be synced (see the check below).
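This is roughly how the re-initialisation can be observed on the failed node after it rejoins (container ID is a placeholder; /bitnami/postgresql/data is the data directory under the volume mapped above): the directory timestamp and size show it was recreated from scratch rather than the standby catching up from its existing data.

# on PC2, after pg-0 has come back up and rejoined as a standby
docker exec -it <pg-0-container-id> ls -ld /bitnami/postgresql/data   # modification time of the data directory
docker exec -it <pg-0-container-id> du -sh /bitnami/postgresql/data   # size grows again as the full copy runs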