PostgreSQL Replication
ProxPanel’s HA cluster uses PostgreSQL streaming replication to keep a hot standby in sync with the primary. This page covers the database-level mechanics — the wal_level, replication slots, pg_basebackup, pg_hba.conf, and how to monitor lag. For the panel-level cluster UI and failover flow, see HA Cluster and Failover.
The replication setup is automated by the panel when you click Configure as Main Server and Join as Secondary. This page is for understanding what happened, debugging when it didn’t, and running it by hand if you need to.
What streaming replication is
The main server’s Postgres writes a Write-Ahead Log (WAL) for every change. With streaming replication, the secondary’s Postgres opens a TCP connection to the main, asks for “give me every WAL record from LSN X onward”, and applies them locally as they arrive. The secondary’s data is identical to the main’s, lagging by however long it takes the network to deliver and the replica to replay.

    ┌──────────────────────┐                      ┌─────────────────────────┐
    │ MAIN proxpanel-db    │ ── port 5432 ──────▶ │ SECONDARY proxpanel-db  │
    │                      │  (replicator user)   │                         │
    │ pg_wal/              │                      │ standby.signal          │
    │  └─ 000…001          │      WAL stream      │ pg_wal/                 │
    │  └─ 000…002          │ ───────────────────▶ │  └─ 000…001             │
    │  └─ 000…003 (live)   │   via replication    │  └─ 000…002             │
    │                      │    slot replica_2    │  └─ 000…003 (applying)  │
    └──────────────────────┘                      └─────────────────────────┘

The replication slot ensures the main doesn’t recycle WAL segments the secondary hasn’t consumed yet, so a brief network blip doesn’t force a full re-base.
Postgres settings on the main
SetupMainServer() in services/postgres_replication.go runs:

    ALTER SYSTEM SET wal_level = replica;
    ALTER SYSTEM SET max_wal_senders = 10;
    ALTER SYSTEM SET max_replication_slots = 10;
    ALTER SYSTEM SET wal_keep_size = '1GB';
    ALTER SYSTEM SET hot_standby = on;
    ALTER SYSTEM SET listen_addresses = '*';
    SELECT pg_reload_conf();

| Setting | Why |
|---|---|
| wal_level = replica | Generate enough WAL detail for physical replication. |
| max_wal_senders = 10 | Allow up to 10 simultaneous replicas + base-backups. |
| max_replication_slots = 10 | One slot per replica. |
| wal_keep_size = '1GB' | Retain 1 GB of WAL on disk in case a slow replica falls behind. |
| hot_standby = on | Allow read queries on the replica while it’s streaming. |
| listen_addresses = '*' | Without this, Postgres binds only to localhost and the replica can’t connect. |
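These are set with ALTER SYSTEM, which writes to postgresql.auto.conf and survives container restarts. The proxpanel-db Postgres data directory is a Docker volume, so this state is persistent. Note that pg_reload_conf() applies only reloadable settings such as wal_keep_size; wal_level, max_wal_senders, max_replication_slots, hot_standby, and listen_addresses take effect at the next Postgres restart.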
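Writing a value with ALTER SYSTEM and having it live are two different things, so it can help to ask Postgres what it is actually running with. The sketch below is one way to do that: it prints a query against the pg_settings view, whose pending_restart column flags values that were changed on disk but are not yet in effect. The function name replication_settings_sql is hypothetical, and the container, user, and database names in the usage comment are the ones used elsewhere on this page.

```shell
# Hypothetical helper: prints a query that lists the replication-related
# settings, their current values, and whether a restart is still pending.
replication_settings_sql() {
  cat <<'SQL'
SELECT name, setting, context, pending_restart
  FROM pg_settings
 WHERE name IN ('wal_level', 'max_wal_senders', 'max_replication_slots',
                'wal_keep_size', 'hot_standby', 'listen_addresses')
 ORDER BY name;
SQL
}

# Usage (assumes the proxpanel-db container and proxpanel user/database):
#   replication_settings_sql | docker exec -i proxpanel-db psql -U proxpanel -d proxpanel
```

A row with pending_restart = t means the new value is saved but the running server still uses the old one.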
The replicator role
    CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD '<random>';

The password placeholder is filled with whatever was set as DB_PASSWORD on the main. Same password, separate role, narrow privileges (REPLICATION only, no DB access).
pg_hba.conf
pg_hba.conf controls who can connect over the network. Out of the box, Postgres allows nothing from outside the container. You must add:

    host replication replicator <secondary_ip>/32 md5

The panel logs this exact line on SetupMainServer() but does not modify pg_hba.conf automatically. That file lives inside the Postgres data volume, and editing it via the container is the safest path.

- Identify the secondary’s IP (the IP the secondary will connect from, not the main’s IP).
- On the main, append the line:

      docker exec proxpanel-db bash -c \
        "echo 'host replication replicator <SECONDARY_IP>/32 md5' >> /var/lib/postgresql/data/pg_hba.conf"

- Reload Postgres (no restart needed):

      docker exec proxpanel-db psql -U proxpanel -d proxpanel -c "SELECT pg_reload_conf();"

- Verify:

      docker exec proxpanel-db cat /var/lib/postgresql/data/pg_hba.conf | grep replication
If you skip this, the secondary’s pg_basebackup will fail with FATAL: no pg_hba.conf entry for replication connection.
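Appending with >> adds a duplicate line every time the step is repeated. As a minimal sketch of an idempotent alternative, the function below appends the rule only when that exact line is not already present. The name add_hba_entry is hypothetical, and the file path is whatever location pg_hba.conf is reachable from in your setup.

```shell
# add_hba_entry FILE IP  (hypothetical helper)
# Appends the replication rule for IP to FILE unless the exact line exists.
add_hba_entry() {
  hba_file="$1"
  hba_line="host replication replicator $2/32 md5"

  # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  if grep -qxF "$hba_line" "$hba_file" 2>/dev/null; then
    return 0
  fi

  # If the file is non-empty and lacks a trailing newline, add one first so
  # the new rule does not get glued onto the previous line.
  if [ -s "$hba_file" ] && [ -n "$(tail -c 1 "$hba_file")" ]; then
    printf '\n' >> "$hba_file"
  fi

  printf '%s\n' "$hba_line" >> "$hba_file"
}
```

Running it twice with the same IP leaves a single entry; pg_reload_conf() is still needed afterwards.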
Replication slots
    SELECT pg_create_physical_replication_slot('replica_2');

Slots are created on the main, one per secondary. The slot name is replica_<node_id>, where node_id is the auto-incrementing ID in cluster_nodes.
The slot guarantees WAL retention. If the secondary is offline for 4 hours, the main holds 4 hours of WAL on disk. Retention is capped by max_slot_wal_keep_size if that is set; by default it is unbounded, so a replica that stays offline can eventually fill the main’s disk. When the secondary reconnects, replication resumes from where it left off.
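To make the naming rule and the retention cost concrete, here is a small sketch. slot_name derives the slot name from a node ID, and retained_wal_sql prints a query showing how many bytes of WAL each slot is currently holding back on the main. Both function names are hypothetical; the query uses only the pg_replication_slots view and the built-in pg_wal_lsn_diff() and pg_current_wal_lsn() functions.

```shell
# slot_name NODE_ID -> replica_<NODE_ID>  (hypothetical helper)
# Rejects anything that is not a plain non-negative integer.
slot_name() {
  case "$1" in
    ''|*[!0-9]*)
      echo "node id must be numeric: '$1'" >&2
      return 1
      ;;
  esac
  echo "replica_$1"
}

# Prints a query reporting the WAL each slot forces the main to keep.
retained_wal_sql() {
  cat <<'SQL'
SELECT slot_name, active, wal_status,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
  FROM pg_replication_slots;
SQL
}

# Usage (run against the main):
#   retained_wal_sql | docker exec -i proxpanel-db psql -U proxpanel -d proxpanel
```

A retained_bytes value that keeps growing while active is f points to a secondary that is offline and still pinning WAL.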
Setting up the replica
SetupReplicaServer() generates a setup script at /tmp/setup_replica.sh rather than running it directly, because stopping Postgres while the API container is still running would break the live DB connection. You execute the script manually.
The script does:
    # Stop the local Postgres container.
    docker stop proxpanel-db

    # Archive the existing data directory to /tmp on the host.
    docker run --rm -v proxpanel_postgres_data:/data -v /tmp:/backup alpine \
      tar -czf /backup/postgres_backup_TIMESTAMP.tar.gz -C /data .

    # Empty the data volume.
    docker run --rm -v proxpanel_postgres_data:/data alpine \
      sh -c "rm -rf /data/*"

    # Copy the main's data directory into the volume over a replication connection.
    docker run --rm \
      -v proxpanel_postgres_data:/var/lib/postgresql/data \
      -e PGPASSWORD='<replicator_password>' postgres:16 \
      pg_basebackup -h MAIN_IP -p 5432 -U replicator \
      -D /var/lib/postgresql/data -Fp -Xs -P -R -S replica_2

    # Make sure the standby marker exists, then start Postgres again.
    docker run --rm -v proxpanel_postgres_data:/data alpine touch /data/standby.signal
    docker start proxpanel-db

| Flag | Meaning |
|---|---|
| -Fp | Plain format (not tar): output to a directory. |
| -Xs | Stream WAL in parallel during base backup. |
| -P | Show progress. |
| -R | Write primary_conninfo to postgresql.auto.conf and create standby.signal. |
| -S replica_2 | Use replication slot named replica_2. |
After this, Postgres starts in standby mode. The empty standby.signal file is the marker; if you delete it and restart, Postgres exits recovery and becomes writable. pg_promote() also removes the file and ends recovery, but without a restart.
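Before starting the container it can be worth confirming that the base backup left the data directory in standby shape. The sketch below checks for the two things the -R flag is described as producing: the standby.signal file and a primary_conninfo entry in postgresql.auto.conf. The function name is_standby_dir is hypothetical, and the directory argument is wherever the data volume is mounted when you run the check.

```shell
# is_standby_dir DIR  (hypothetical helper)
# Succeeds if DIR contains standby.signal and a primary_conninfo setting.
is_standby_dir() {
  if [ ! -f "$1/standby.signal" ]; then
    echo "missing standby.signal in $1" >&2
    return 1
  fi
  if ! grep -q '^primary_conninfo' "$1/postgresql.auto.conf" 2>/dev/null; then
    echo "no primary_conninfo in $1/postgresql.auto.conf" >&2
    return 1
  fi
  return 0
}
```

If either check fails, the server would either come up as a standalone primary or have no main to stream from, so re-running the pg_basebackup step is safer than starting the container.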
Verifying the replica is streaming
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT pg_is_in_recovery();"
    # → t (true): this is a replica

    # -x turns on expanded output; psql cannot mix SQL and \gx inside one -c string
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -x -c \
      "SELECT * FROM pg_stat_wal_receiver;"
    # → status: streaming
    #   sender_host: <main_ip>
    #   slot_name: replica_2
    #   last_msg_receipt_time: 2026-05-12 14:05:30+00

Monitoring lag
From the main
    SELECT application_name, client_addr, state, sync_state,
           pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes
      FROM pg_stat_replication;

lag_bytes is how many bytes of WAL the replica hasn’t replayed yet. Under ~1 MB is healthy. Sustained tens or hundreds of megabytes means the replica is overwhelmed or the network is choking.
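lag_bytes is simply the distance between two LSNs. An LSN such as 0/3000060 is a 64-bit WAL position written as two hexadecimal halves, so the subtraction can be reproduced by hand. The sketch below does that and maps the result to a rough verdict. All three function names are hypothetical, and the 10 MB boundary between "elevated" and "lagging" is an assumption; only the ~1 MB healthy figure comes from the guidance above.

```shell
# lsn_to_bytes LSN: convert "HI/LO" (two hex numbers) to an absolute byte position.
lsn_to_bytes() {
  lsn_hi="${1%%/*}"   # text before the slash: high 32 bits
  lsn_lo="${1##*/}"   # text after the slash: low 32 bits
  echo $(( (0x$lsn_hi << 32) + 0x$lsn_lo ))
}

# lsn_diff A B: byte distance A - B, the value pg_wal_lsn_diff(A, B) reports.
lsn_diff() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# classify_lag BYTES: rough verdict. The 10 MB cut-off is an assumed threshold.
classify_lag() {
  if [ "$1" -lt 1048576 ]; then
    echo healthy
  elif [ "$1" -lt 10485760 ]; then
    echo elevated
  else
    echo lagging
  fi
}
```

For example, lsn_diff 0/3000060 0/3000000 gives 96, which classify_lag reports as healthy. A single reading says little; the guidance above is about sustained values.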
From the replica
    SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

This is the time delta: typically sub-second, though it may climb to seconds under load. Because it measures the time since the last replayed transaction, it also grows while the main is idle.
The cluster service uses this exact query (GetReplicationLagSeconds()) and reports it in the heartbeat. The Cluster tab in the panel shows the result.
Promoting the replica
When the secondary needs to become the new main (planned switchover or automatic failover), the panel calls:

    SELECT pg_promote();

This:

- Replays any pending WAL.
- Removes standby.signal.
- Exits recovery mode.
- Begins accepting writes.
Promotion typically takes 1 to 5 seconds. The connection from the application (proxpanel-api) usually doesn’t even drop; the next write succeeds.
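When promoting by hand, a script usually needs to wait until the server has actually left recovery before sending writes. The following sketch polls pg_is_in_recovery() until it returns f. The function names probe_recovery and wait_until_primary, and the default of 10 attempts one second apart, are assumptions for illustration; this is not how the panel itself waits.

```shell
# probe_recovery: prints "t" while the server is a replica, "f" once promoted.
# -A (unaligned) and -t (tuples only) strip the table decoration from psql output.
probe_recovery() {
  docker exec proxpanel-db psql -U proxpanel -d proxpanel -Atc \
    "SELECT pg_is_in_recovery();"
}

# wait_until_primary [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the probe reports "f", 1 if it never does.
wait_until_primary() {
  attempts="${1:-10}"
  delay="${2:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(probe_recovery 2>/dev/null)" = "f" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_until_primary 10 1 && echo "promoted" || echo "still in recovery"
```

Keeping the probe in its own function makes it easy to swap in a different connection method or to stub it out when testing the loop.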
The old primary cannot simply be re-attached as a new secondary: its WAL diverged from the new primary’s at the moment of promotion. You must run pg_basebackup again from the new primary to re-base it. The panel’s DemoteToReplica() generates a script for this; see Failover → Re-attaching the old main for the full flow.
CLI cheatsheet
    # Status (from main)
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT * FROM pg_stat_replication;"

    # Status (from replica)
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT * FROM pg_stat_wal_receiver;"

    # Replication slots on main
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT slot_name, active, wal_status FROM pg_replication_slots;"

    # Is this a replica?
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT pg_is_in_recovery();"

    # Force a WAL switch on main (useful during testing: forces the replica to
    # receive a new segment)
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT pg_switch_wal();"

    # Manually promote (last-resort, normally you'd use the panel UI)
    docker exec proxpanel-db psql -U proxpanel -d proxpanel -c \
      "SELECT pg_promote();"

Common pitfalls
- FATAL: no pg_hba.conf entry for replication connection. Add the line "host replication replicator <ip>/32 md5". Don’t forget the pg_reload_conf().
- pg_basebackup: connection refused. listen_addresses is not set to * on main, or the Postgres port 5432 isn’t reachable from the secondary. The install script binds 5432 to 127.0.0.1 only; cluster setup adds an additional bind to the cluster network. Confirm with docker port proxpanel-db.
- Replica catches up, then falls behind, then catches up. Long-running write transaction on main (mass FUP reset, bulk subscriber import). Wait it out; the slot ensures no WAL is lost.
- wal_status = lost on a slot. The main removed WAL the replica hadn’t yet consumed. With a slot in place this happens only when max_slot_wal_keep_size is set and the slot’s backlog grew past it. You must run pg_basebackup again. Raise max_slot_wal_keep_size, or use a paid backup service that captures WAL externally.
- Postgres won’t start after rebase. Permissions. The pg_basebackup command runs as root inside an alpine container; Postgres won’t open a data directory it doesn’t own. The setup script ends with chown -R 999:999 /data; if you ran it by hand and skipped that, the container exits with permission denied.
- Replica is read-only but the panel UI is showing live updates. That’s the heartbeat coming in over the API layer: heartbeats write to cluster_nodes on the main, which is then replicated back. The secondary’s UI dashboards refresh from its own (read-only) DB just fine.
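The wal_status column in pg_replication_slots takes one of four values, and only one of them calls for a re-base. As a rough sketch, the function below turns each value into a suggested action. The function name slot_action and the wording of the messages are hypothetical; the four status names are the ones PostgreSQL reports.

```shell
# slot_action WAL_STATUS  (hypothetical helper)
# Maps a pg_replication_slots.wal_status value to a suggested action.
slot_action() {
  case "$1" in
    reserved)
      echo "ok: required WAL is within max_wal_size" ;;
    extended)
      echo "ok: WAL beyond max_wal_size is being retained for the slot" ;;
    unreserved)
      echo "warning: required WAL may be removed at the next checkpoint" ;;
    lost)
      echo "rebase: required WAL is gone, run pg_basebackup again" ;;
    *)
      echo "unknown wal_status: $1" >&2
      return 1 ;;
  esac
}
```

Feeding it the wal_status values from the "Replication slots on main" query gives a quick read on whether any secondary is about to need a full re-base.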
Permissions
Running these commands requires shell access on the host (root or docker group). Inside the panel UI, the cluster setup actions are admin-only.
Related pages
- HA Cluster: the panel-level wrapping of this replication.
- Failover: pg_promote() + Redis + DNS in one workflow.
- Backups & Recovery: replication does not protect against DROP TABLE; backups do.