Database Replication

September 15, 2025

Why replicate?

Replication provides availability (failover), read scaling, geo-proximity, and a replica to take backups from without loading the leader. It comes with lag and consistency trade-offs.

Models

  • Leader-replica: all writes go to the leader, which replicates them synchronously or asynchronously to replicas that serve reads.
  • Multi‑leader: writes in multiple regions; conflict resolution required.
  • Leaderless: quorum reads/writes (e.g., Dynamo‑style).
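The quorum idea behind the leaderless model can be sketched as follows. With N replicas per key, W write acknowledgements, and R read responses, a read is guaranteed to see the latest acknowledged write when R + W > N. The type and function names here are hypothetical, and the per-key version number is an assumption; real Dynamo-style systems often use vector clocks instead.

```typescript
// Hypothetical types and helpers for illustration; not from any real library.
interface QuorumConfig {
  n: number; // replicas holding each key
  w: number; // acknowledgements required before a write succeeds
  r: number; // responses required before a read returns
}

// True when every read quorum must intersect every write quorum (R + W > N).
function quorumsOverlap({ n, w, r }: QuorumConfig): boolean {
  return r + w > n;
}

interface VersionedValue<T> {
  value: T;
  version: number; // assumed to increase with every write to the key
}

// Given the responses collected from a read quorum, return the newest one.
function resolveRead<T>(responses: VersionedValue<T>[]): VersionedValue<T> {
  if (responses.length === 0) throw new Error("no responses from quorum");
  return responses.reduce((newest, cur) =>
    cur.version > newest.version ? cur : newest
  );
}
```

For example, N=3 with W=2 and R=2 overlaps, while W=1 and R=1 does not, so the latter can return stale data.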

Lag and read‑your‑writes

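  • With asynchronous replication, replicas may lag behind the leader. To guarantee read-your-writes, route a user's reads to the leader for a short window after each write, or attach a session token that records the position of the last write so reads only go to replicas that have caught up to it.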
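A minimal sketch of the session-token approach, assuming each write returns a log position and each replica reports the last position it has applied. All names are hypothetical; in Postgres the position would typically be an LSN, simplified here to a number.

```typescript
// Hypothetical shapes for illustration only.
interface ReplicaState {
  name: string;
  appliedPosition: number; // last log position this replica has applied
}

interface Session {
  lastWritePosition: number; // position of the session's latest write; 0 if none
}

// Return the name of a replica that has already applied the session's last
// write, or "leader" when no replica has caught up yet.
function chooseReadTarget(session: Session, replicas: ReplicaState[]): string {
  const caughtUp = replicas.find(
    (r) => r.appliedPosition >= session.lastWritePosition
  );
  return caughtUp ? caughtUp.name : "leader";
}
```

Sessions that have not written recently can read from any replica, so only the sessions that need fresh data add load to the leader.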

Failover

  • Managed systems typically elect a new leader through consensus (Raft or Paxos); otherwise, promote a replica manually or with failover tooling. Test failover regularly.
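When promoting a replica by hand, one common heuristic is to pick the healthy replica that has applied the most of the leader's log, which minimizes lost writes. The sketch below assumes health and applied position are already known; the names are hypothetical, and it is not a substitute for a consensus protocol.

```typescript
// Hypothetical shape for illustration only.
interface Candidate {
  name: string;
  healthy: boolean;
  appliedPosition: number; // last log position applied by this replica
}

// Pick the healthy replica that is furthest ahead; undefined if none is healthy.
function pickPromotionCandidate(replicas: Candidate[]): Candidate | undefined {
  return replicas
    .filter((r) => r.healthy) // filter() copies, so the input is not reordered
    .sort((a, b) => b.appliedPosition - a.appliedPosition)[0];
}
```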

Code: Postgres hot standby

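# language-bash
# postgresql.conf on the primary
wal_level=replica                     # write enough WAL for a standby to replay
archive_mode=on                       # archive completed WAL segments
archive_command='cp %p /mnt/wal/%f'   # example only; a production command should refuse to overwrite existing files
# postgresql.conf on the standby
hot_standby=on                        # accept read-only queries during recovery

-- language-sql
-- On replica: allow read-only queries while in recovery
ALTER SYSTEM SET hot_standby = on;  -- takes effect after a server restart
-- Returns true on a standby, false on the primary
SELECT pg_is_in_recovery();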

Route reads with lag awareness

// language-typescript
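// `leader` and `pickReplicaByLag` are assumed to be defined elsewhere.
async function readUser(id: string) {
  // Prefer a replica within 10 ms of the leader; otherwise fall back to the leader.
  const target = pickReplicaByLag(10 /* max lag in ms */) ?? leader
  return target.query('SELECT * FROM users WHERE id = $1', [id])
}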
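One possible shape for the replica picker is sketched below. It is an assumption, not the original helper: here it takes the replica list explicitly, and `lagMs` is assumed to be refreshed elsewhere (in Postgres, for example, by polling `now() - pg_last_xact_replay_timestamp()` on each standby).

```typescript
// Hypothetical shape for illustration only.
interface Replica {
  name: string;
  lagMs: number; // most recently measured replication lag
}

// Return the least-lagged replica within the threshold, or undefined if
// every replica is too far behind (the caller then falls back to the leader).
function pickReplicaByLag(
  maxLagMs: number,
  replicas: Replica[]
): Replica | undefined {
  let best: Replica | undefined;
  for (const r of replicas) {
    if (r.lagMs <= maxLagMs && (best === undefined || r.lagMs < best.lagMs)) {
      best = r;
    }
  }
  return best;
}
```

Picking the least-lagged replica is simple but concentrates load; choosing randomly among all replicas under the threshold spreads reads more evenly.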

Conflict resolution (multi‑leader)

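  • Last-write-wins (timestamps) is simple but lossy: of two concurrent writes, the one with the older timestamp is silently discarded, and clock skew can pick the wrong winner.
  • CRDTs (operation-based or state-based) or per-entity merge functions reduce data loss by combining concurrent updates instead of discarding one.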
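The difference between the two approaches can be sketched as follows. The CRDT shown is a state-based grow-only counter (G-Counter), chosen because it is the smallest example; the names are hypothetical.

```typescript
// Last-write-wins: keep whichever value has the later timestamp.
interface Stamped<T> {
  value: T;
  timestamp: number;
}

function lastWriteWins<T>(a: Stamped<T>, b: Stamped<T>): Stamped<T> {
  return b.timestamp > a.timestamp ? b : a; // the other write is dropped
}

// G-Counter: each node increments only its own slot; merging takes the
// per-node maximum, so concurrent increments from different nodes both survive.
type GCounter = Record<string, number>;

function mergeGCounter(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const node of Object.keys(b)) {
    merged[node] = Math.max(merged[node] ?? 0, b[node] ?? 0);
  }
  return merged;
}

function gCounterValue(counter: GCounter): number {
  let total = 0;
  for (const node of Object.keys(counter)) {
    total += counter[node] ?? 0;
  }
  return total;
}
```

Merging `{ eu: 2, us: 1 }` with `{ eu: 1, us: 3 }` gives `{ eu: 2, us: 3 }`, a total of 5, regardless of merge order. Last-write-wins on the same two states would keep only one of them.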

Backups ≠ replication

Replication faithfully propagates corruption and mistakes (for example, an accidental DROP TABLE) to every replica. Keep offsite snapshots and test restores.

Analogy

Replication is like photocopying a document for multiple offices. Copies may be slightly behind; during a printer outage, people use the nearest copy, then reconcile with the master later.

FAQ

  • Can I do strong reads from replicas? Only with synchronous replication or majority quorums.
  • How do I avoid stale reads? Route session reads to the leader or use read-your-writes markers (session tokens).

Try it

Measure replica lag under peak writes; enforce a policy to avoid routing hot sessions to laggy replicas.

Models

  • Leader-replica: writes go to the primary, reads go to the secondaries.
  • Multi-leader: writes in multiple regions; conflict resolution required.
  • Leaderless (quorums): Dynamo-like, tunable consistency.

Key points

  • Lag: asynchronous replication → potentially stale reads.
  • Read-your-writes: route temporarily to the primary or use sessions.
  • Failover: automatic promotion; Raft/Paxos for consensus.

Best practices

  • Measure replica lag; use lag-aware routing policies.
  • Keep backups independent of replication.