PostgreSQL Replication
Summary table
Replication type | Synchronous | Asynchronous |
---|---|---|
Writing transaction | Send to reader, reader confirms, then transaction ends | Transaction ends, then send to reader |
Writing transaction latency | Higher | Unaffected |
High load on reader | Increases writing transaction latency on writer | Doesn't influence writer |
Reader consistency | ACID | Eventual consistency |
When to use reader | All read-only queries | Only those read-only queries that allow for eventual consistency |
PostgreSQL documentation | Synchronous replication | Hot standby; replication |
RMB documentation | Read and write database instances setup | to do |
AWS and reader queries | not supported | supported |
Introduction
A PostgreSQL instance can be replicated so that there exists one write instance (also called the primary) and one ore more read instances (called the replicas).
This can be used for high availability: If the writer fails, simply make one of the readers the new writer.
Replication can also been used for horizontal scaling: Read-only queries can be run on the reader instances to reduce the load on the writer. Usually more than 90 % of the load is from read-only queries.
This document discusses replication for read-only queries.
Writing transaction
If a transaction on the writer doesn't change any data, no replication is needed. This may happen if no record is affected, or if the transaction rolls back.
If data is changed, the change needs to be replicated to the readers.
For synchronous replication the writer sends the data to the readers, then the writer waits until all readers have replied with a confirmation that they have received the new data, and then ends the transaction and only then notifies the SQL client that the SQL transaction is completed.
For asynchronous replication the SQL client is immediately notified that the SQL transaction is completed, and then sends the data to the readers. The readers don't send any confirmation to the writer.
Writing transaction latency
For synchronous replication all data changing transactions have higher latency when adding replicas because of writer needs to wait until all readers have received the new data. This is required to ensure ACID.
Asynchronous replication doesn't increase transaction latency.
High load on reader
If there's high load on a reader and synchronous replication is used this delays the confirmation messages for the writer, this increases the latency of writer transactions.
For asynchronous replication high load on readers won't influence the writer.
Reader consistency
Synchronous readers are included into writer transactions and therefore can ensure ACID consistency.
Asynchronous readers are not included into writer transactions. It always takes some time until new data is available on the readers. This time lag increases when there is high load on the reader and/or on the writer. The readers only provide eventual consistency.
When to use reader
If synchronous replication is used, all read-only queries can be run on the readers. ACID consistency of the readers ensure that the result from a reader is exactly the same as from the writer.
Developers can send all read-only queries to the readers. To ensure that write queries are not sent to readers where they fail RMB provides a read-write-split testing setup for unit tests.
If asynchronous replication is used, read-only queries on the readers may return outdated data because of eventual consistency. If latest data is needed the read-only query must be sent to the writer (or to a synchronous replica). Most APIs including CRUD APIs are designed for ACID consistency and therefore cannot use asynchronous replication.
Use cases for asynchronous replication read-only queries:
- Regular timer based tasks (cronjobs)
- Examples: Find records to disable expired patron account, to delete expired request, to send reminder for overdue loan
- If data is not available on the current execution it will be processed during the next execution
- Developers must anticipate that the returned records might be outdated.
- Therefore use the async reader only for finding records to process
- Use the writer (or a sync reader) to confirm that the record needs to be processed
- And use the writer to update the record.
- Export to discovery system
- Other bulk export