Ardent Performance Computing

Jeremy Schneider


    Testing CloudNativePG Preferred Data Durability

Posted by Jeremy Schneider
Filed under: kubernetes, Planet, PostgreSQL, Technical

    This is the third post about running Jepsen against CloudNativePG. Earlier posts:

First: a shout-out to whoever came up with Oracle Data Guard Protection Modes. Designing them to be explained as a choice between performance, availability, and protection was a great idea.

Yesterday’s blog post described how the core of all data safety is copies of the data, and the importance of efficient architectures to meet data safety requirements.

With Postgres, three-node clusters ensure the highest level of availability if one host fails. But two-node clusters are often worth the cost savings in exchange for a few seconds of unavailability during cluster reconfigurations. Similar to Oracle, Postgres two-node clusters can be configured to maximize performance, availability, or protection.

| Oracle Data Guard mode | Behavior | Patroni configuration | CloudNativePG configuration |
| --- | --- | --- | --- |
| Max Performance (Oracle default) | Async; fastest commits; possible data loss on failover | Patroni default | CNPG default |
| Max Availability (NOAFFIRM) | Sync when standby available; acknowledge after standby write (not flush); if none available, don't block | synchronous_mode: true; synchronous_commit: remote_write | method: any; number: 1; dataDurability: preferred; synchronous_commit: remote_write |
| Max Availability (AFFIRM) | Sync when standby available; acknowledge after standby flush; if none available, don't block | synchronous_mode: true | method: any; number: 1; dataDurability: preferred |
| Max Protection | Always sync; if no sync standby, block commits (no data loss) | synchronous_mode: true; synchronous_mode_strict: true | method: any; number: 1 |
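
For reference, here is a minimal sketch of what the Max Availability (NOAFFIRM) row above could look like as a CloudNativePG Cluster manifest. The cluster name and storage size are placeholders rather than values from my environment:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example                    # placeholder name
spec:
  instances: 2
  postgresql:
    parameters:
      synchronous_commit: "remote_write"   # NOAFFIRM analog: ack after standby write, not flush
    synchronous:
      method: any
      number: 1
      dataDurability: preferred            # don't block commits if no standby is available
  storage:
    size: 10Gi                             # placeholder
```

Dropping the synchronous_commit override gives the AFFIRM row, and leaving dataDurability at its default of required gives Max Protection.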

    Automated failovers can involve a small amount of data loss with maximum performance and maximum availability configurations. With Oracle Fast-Start Failover, the FastStartFailoverLagLimit configuration property indicates the maximum amount of data loss that is permissible in order for an automatic failover to occur.

The previous blog post in this series compared CloudNativePG Max Performance and Max Protection modes. Now I want to take a look at Max Availability. In CloudNativePG, the key setting here is spec.postgresql.synchronous.dataDurability. When dataDurability is set to preferred, the required number of synchronous instances adjusts based on the number of available standbys. PostgreSQL will attempt to replicate WAL records to the designated number of synchronous standbys, but write operations will continue even if fewer than the requested number of standbys are available.

    All of these experiments were executed on my HP EliteBook (Ryzen Pro 5) with two CNPG Lab VMs via Hyper‑V and the tests ran in a loop for 12–24 hours to aggregate failure rates across the runs.

    Experiment 1

I used the same test harness as before to induce rapid failures. The harness waits for all replicas to be READY (per k8s) and then immediately kills the writer.

    Hypothesis: in max protection mode we won’t see any data loss, but we will see data loss in max availability mode. Adding a third node to the cluster should reduce the likelihood of data loss.

| dataDurability | instances | runs showing data loss |
| --- | --- | --- |
| required | 2 | 0% [results] |
| preferred | 2 | 48% [results] |
| preferred | 3 | 4% [results] |

    Findings: Setting dataDurability: preferred in CloudNativePG allows for higher availability but can result in data loss during failover, especially in smaller clusters. I was surprised how much the third node helped.

    Experiment 2

    Hypothesis A: I was seeing a high failure rate specifically because the rapid failures were triggering a failover before CloudNativePG had enough time to restart synchronous replication after the last failure. If there are 60 seconds between each failure, then we shouldn’t see any data loss.

    Hypothesis B: CloudNativePG has a failoverDelay setting which can inject a delay before the CNPG reconciliation loop triggers a failover when the primary is unhealthy. If we set this to 60 seconds then we shouldn’t see any data loss.
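
To illustrate, this is the fragment of the Cluster spec in question (a sketch showing only the field under discussion):

```yaml
# Fragment of a CloudNativePG Cluster spec (sketch)
spec:
  failoverDelay: 60   # seconds to wait, after the primary is detected unhealthy, before failing over
```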

    nb. I also switched to running the latest development build from the trunk of CloudNativePG. (Separately, I had wanted to test some code that was checked in the day before I ran these tests.)

| seconds between kills | failoverDelay | runs showing data loss |
| --- | --- | --- |
| 0 | 0 | 40% [results] |
| 60 | 0 | 4% [results] |
| 0 | 60 | 0% [results] |

    Findings: Introducing a delay – either by spacing out failures or by configuring failoverDelay – dramatically reduced or eliminated data loss in preferred mode. When failures occurred back-to-back with no delay, data loss was frequent. However, waiting 60 seconds between failures, or setting a 60-second failoverDelay, allowed CloudNativePG enough time to reestablish synchronous replication, resulting in little or no data loss.

    What this means

CloudNativePG’s preferred data durability mode offers data safety and high availability with lower-cost two-node clusters by allowing commits to proceed even if the synchronous standby is temporarily unavailable. However, this flexibility comes with a small risk of data loss during failover, especially when failures happen in rapid succession. Introducing a delay via the failoverDelay setting reduces this risk. For environments where data durability is paramount, three-node clusters in required mode remain the safest choice, but for those willing to trade a small risk of data loss for improved availability, two-node clusters in preferred mode can be a practical option. Consider setting failoverDelay alongside preferred durability for extra safety.
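
Putting the two together, a sketch of a two-node preferred-durability cluster with a failover delay (the name and storage size are placeholders):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-preferred    # placeholder name
spec:
  instances: 2
  failoverDelay: 60     # trade ~60s of failover latency for a smaller data-loss window
  postgresql:
    synchronous:
      method: any
      number: 1
      dataDurability: preferred
  storage:
    size: 10Gi          # placeholder
```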


    About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider


    Disclaimer

This is my personal website. The views expressed here are mine alone and may not reflect the views of my employer. I am currently looking for consulting and/or contracting work in the USA around the Oracle database ecosystem.

contact: 312-725-9249 or schneider @ ardentperf.com

