Docker PostgreSQL replication and failover setup: data syncing issue

What steps will reproduce the bug?

I have 3 physical PCs. Configuration details:

  • RAM - 32 GB
  • Hard disk - 2 TB
  • CPU - 8 cores
  • OS - Ubuntu Server 20.04 LTS

Docker swarm setup details

  1. PC1 - Manager node
  2. PC2 - Worker node (primary database)
  3. PC3 - Worker node (secondary database)
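For the placement constraints in the compose file to match, the two worker nodes need the corresponding Swarm labels. A minimal sketch of the labeling commands, run on the manager (the label values `node1`/`node2` come from the `node.labels.type` constraints; the hostnames are placeholders):

```shell
# Run on the manager (PC1). Label values match the compose
# placement constraints: node.labels.type == node1 / node2.
docker node update --label-add type=node1 <PC2-hostname>
docker node update --label-add type=node2 <PC3-hostname>

# Verify the labels were applied:
docker node inspect <PC2-hostname> --format '{{ .Spec.Labels }}'
```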

My problem is with the PostgreSQL replication and failover setup. In the Docker Swarm setup, if the primary node goes down, the secondary node takes over as primary. When the original primary node comes back online, its existing Docker volume data is reinitialized (i.e. the existing data is removed and a full resync from the new primary starts from scratch).

version: '3.8'
services:
  pg-0:
    image: bitnami/postgresql-repmgr:14
    ports:
      - 5432
    volumes:
      - pg_0_data:/bitnami/postgresql
    environment:
      - POSTGRESQL_POSTGRES_PASSWORD=adminpassword
      - POSTGRESQL_USERNAME=customuser
      - POSTGRESQL_PASSWORD=custompassword
      - POSTGRESQL_DATABASE=customdatabase
      - POSTGRESQL_NUM_SYNCHRONOUS_REPLICAS=1
      - REPMGR_PRIMARY_HOST=pg-0
      - REPMGR_PARTNER_NODES=pg-1,pg-0
      - REPMGR_NODE_NAME=pg-0
      - REPMGR_NODE_NETWORK_NAME=pg-0
      - REPMGR_USERNAME=repmgr
      - REPMGR_PASSWORD=repmgrpassword
      - REPMGR_CONNECT_TIMEOUT=2
      - REPMGR_RECONNECT_ATTEMPTS=1
      - REPMGR_RECONNECT_INTERVAL=2
      - REPMGR_MASTER_RESPONSE_TIMEOUT=5
    deploy:
      placement:
        constraints: [node.labels.type == node1]
  pg-1:
    image: bitnami/postgresql-repmgr:14
    ports:
      - 5432
    volumes:
      - pg_1_data:/bitnami/postgresql
    environment:
      - POSTGRESQL_POSTGRES_PASSWORD=adminpassword
      - POSTGRESQL_USERNAME=customuser
      - POSTGRESQL_PASSWORD=custompassword
      - POSTGRESQL_DATABASE=customdatabase
      - POSTGRESQL_NUM_SYNCHRONOUS_REPLICAS=1
      - REPMGR_PRIMARY_HOST=pg-0
      - REPMGR_PARTNER_NODES=pg-0,pg-1
      - REPMGR_NODE_NAME=pg-1
      - REPMGR_NODE_NETWORK_NAME=pg-1
      - REPMGR_USERNAME=repmgr
      - REPMGR_PASSWORD=repmgrpassword
      - REPMGR_CONNECT_TIMEOUT=2
      - REPMGR_RECONNECT_ATTEMPTS=1
      - REPMGR_RECONNECT_INTERVAL=2
      - REPMGR_MASTER_RESPONSE_TIMEOUT=5
    deploy:
      placement:
        constraints: [node.labels.type == node2]


  pgpool:
    image: bitnami/pgpool:4
    ports:
      - 5000:5432
    environment:
      - PGPOOL_BACKEND_NODES=0:pg-0:5432,1:pg-1:5432
      - PGPOOL_SR_CHECK_USER=repmgr
      - PGPOOL_SR_CHECK_PASSWORD=repmgrpassword
      - PGPOOL_SR_CHECK_PERIOD=5
      - PGPOOL_HEALTH_CHECK_PERIOD=3
      - PGPOOL_HEALTH_CHECK_RETRY_DELAY=2
      - PGPOOL_HEALTH_CHECK_MAX_RETRIES=1
      - PGPOOL_HEALTH_CHECK_TIMEOUT=2
      - PGPOOL_MAX_POOL=32
      - PGPOOL_ENABLE_LDAP=no
      - PGPOOL_POSTGRES_USERNAME=postgres
      - PGPOOL_POSTGRES_PASSWORD=adminpassword
      - PGPOOL_ADMIN_USERNAME=admin
      - PGPOOL_ADMIN_PASSWORD=adminpassword
      - PGPOOL_ENABLE_LOAD_BALANCING=yes
      - PGPOOL_POSTGRES_CUSTOM_USERS=customuser
      - PGPOOL_POSTGRES_CUSTOM_PASSWORDS=custompassword
    healthcheck:
      test: ["CMD", "/opt/bitnami/scripts/pgpool/healthcheck.sh"]
      interval: 5s
      timeout: 5s
      retries: 2
    deploy:
      placement:
        constraints: [node.role == manager]
volumes:
  pg_0_data:
    driver: local
  pg_1_data:
    driver: local
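With the labels in place, the stack above can be deployed and checked roughly as follows (the stack name `pg` is an assumption; Swarm prefixes service names with it):

```shell
# Deploy the compose file as a Swarm stack named "pg" (name assumed):
docker stack deploy -c docker-compose.yml pg

# Confirm all three services have converged (REPLICAS 1/1):
docker service ls

# Inspect where each task is running and its state:
docker service ps pg_pg-0 pg_pg-1 pg_pgpool
```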

What is the expected behavior?

I ran the above docker-compose file on Docker Swarm, and one primary and one secondary database are running. When the primary database goes down, the secondary becomes the primary. When the failed node comes back up, it should act as a secondary database and resync its data from the primary. In reality, its data is fully removed automatically and then synced from the primary from the beginning, not just the updated data. In this case, only the changed data should need to be synced.
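To see what repmgr thinks happened after the failback (which node is primary, and whether the returning node was re-registered or re-cloned), something like the following can be run inside the pg-0 container. The container name lookup and the repmgr.conf path are assumptions based on the Bitnami image defaults:

```shell
# Find the running pg-0 task's container on PC2 and show the
# repmgr view of the cluster (roles, status, upstream):
docker exec -it $(docker ps -qf name=pg-0) \
  repmgr -f /opt/bitnami/repmgr/conf/repmgr.conf cluster show

# The container log shows whether the node rejoined as a standby
# or wiped its data directory and re-cloned from the primary:
docker logs $(docker ps -qf name=pg-0) | tail -n 50
```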

Did you compare parameters with the official example?

Note that Docker Swarm is only highly available with 3 managers, which can still perform worker duties.
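Following that advice, the two worker PCs could be promoted so the swarm has a three-manager quorum. A sketch, with hostnames as placeholders:

```shell
# Promote PC2 and PC3 to managers; a 3-manager raft quorum
# tolerates the loss of any one manager:
docker node promote <PC2-hostname> <PC3-hostname>

# MANAGER STATUS should now show Leader/Reachable on all three:
docker node ls
```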