PostgreSQL Managed Backups are failing with the message: ERROR: canceling statement due to conflict with recovery

Why this Happens

The most common cause of this issue is that your backups are running on a PostgreSQL Replica Server during periods of heavy application load.

This results in statements running on the replica server that are canceled if they conflict with what has happened on the primary database.

You may see an error message like:

PostgreSQL replica is taking a backup of the "users" table, but the primary table has just run "DROP TABLE users;"

or

Dumping the contents of table "users" failed: PQgetResult() failed. pg_dump: Error message from server: FATAL: terminating connection due to administrator command

This kind of failure may appear to be occurring at random as it will only appear when a backup is taken at the same time the server is under load.

How Should I Resolve this Issue?

There are two potential ways to solve this issue:

1) Run the Managed backup on the Master server

Navigate to the Managed Backups Edit page and make sure that the Run on replication slave server if available? option is unchecked

Generally, backups only take a couple of minutes to run, and shouldn’t adversely affect application performance.

2) Change the “max_standby_streaming_delay” setting in your PostgreSQL configuration

The “max_standby_streaming_delay” option in the PostgreSQL configuration file defines how long the replica can wait until it cancels conflicting queued queries. You can edit this by navigating to your stack Configuration → Configuration Files in the Cloud 66 dashboard.

The default setting is 30 seconds. You can increase this value up to the point where the backups succeed. Try increasing the value to 120 seconds and then increasing/decreasing experimentally.

Unfortunately, the correct value is not something that can be automatically determined as it depends on your data.