/
PostgreSQL Replication

PostgreSQL Replication

Summary table

Replication typeSynchronousAsynchronous
Writing transactionSend to reader, reader confirms, then transaction endsTransaction ends, then send to reader
Writing transaction latencyHigherUnaffected
High load on readerIncreases writing transaction latency on writerDoesn't influence writer
Reader consistencyACIDEventual consistency
When to use readerAll read-only queriesOnly those read-only queries that allow for eventual consistency
PostgreSQL documentationSynchronous replicationHot standby; replication
RMB documentationRead and write database instances setupto do
AWS and reader queriesnot supportedsupported

Introduction

A PostgreSQL instance can be replicated so that there exists one write instance (also called the primary) and one ore more read instances (called the replicas).

This can be used for high availability: If the writer fails, simply make one of the readers the new writer.

Replication can also been used for horizontal scaling: Read-only queries can be run on the reader instances to reduce the load on the writer. Usually more than 90 % of the load is from read-only queries.

This document discusses replication for read-only queries.

Writing transaction

If a transaction on the writer doesn't change any data, no replication is needed. This may happen if no record is affected, or if the transaction rolls back.

If data is changed, the change needs to be replicated to the readers.

For synchronous replication the writer sends the data to the readers, then the writer waits until all readers have replied with a confirmation that they have received the new data, and then ends the transaction and only then notifies the SQL client that the SQL transaction is completed.

For asynchronous replication the SQL client is immediately notified that the SQL transaction is completed, and then sends the data to the readers. The readers don't send any confirmation to the writer.

Writing transaction latency

For synchronous replication all data changing transactions have higher latency when adding replicas because of writer needs to wait until all readers have received the new data. This is required to ensure ACID.

Asynchronous replication doesn't increase transaction latency.

High load on reader

If there's high load on a reader and synchronous replication is used this delays the confirmation messages for the writer, this increases the latency of writer transactions.

For asynchronous replication high load on readers won't influence the writer.

Reader consistency

Synchronous readers are included into writer transactions and therefore can ensure ACID consistency.

Asynchronous readers are not included into writer transactions. It always takes some time until new data is available on the readers. This time lag increases when there is high load on the reader and/or on the writer. The readers only provide eventual consistency.

When to use reader

If synchronous replication is used, all read-only queries can be run on the readers. ACID consistency of the readers ensure that the result from a reader is exactly the same as from the writer.

Developers can send all read-only queries to the readers. To ensure that write queries are not sent to readers where they fail RMB provides a read-write-split testing setup for unit tests.

If asynchronous replication is used, read-only queries on the readers may return outdated data because of eventual consistency. If latest data is needed the read-only query must be sent to the writer (or to a synchronous replica). Most APIs including CRUD APIs are designed for ACID consistency and therefore cannot use asynchronous replication.

Use cases for asynchronous replication read-only queries:

  • Regular timer based tasks (cronjobs)
    • Examples: Find records to disable expired patron account, to delete expired request, to send reminder for overdue loan
    • If data is not available on the current execution it will be processed during the next execution
    • Developers must anticipate that the returned records might be outdated.
    • Therefore use the async reader only for finding records to process
    • Use the writer (or a sync reader) to confirm that the record needs to be processed
    • And use the writer to update the record.
  • Export to discovery system
  • Other bulk export