Ardent Performance Computing

Jeremy Schneider


    Testing CloudNativePG Preferred Data Durability

Posted by Jeremy Schneider
Filed under: kubernetes, Planet, PostgreSQL, Technical

    This is the third post about running Jepsen against CloudNativePG. Earlier posts:

First: a shout-out to whoever came up with Oracle Data Guard Protection Modes. Designing them to be explained as a choice between performance, availability, and protection was a great idea.

Yesterday’s blog post described how the core of all data safety is copies of the data, and the importance of efficient architectures to meet data safety requirements.

With Postgres, three-node clusters ensure the highest level of availability if one host fails. But two-node clusters are often worth the cost savings in exchange for a few seconds of unavailability during cluster reconfigurations. Similar to Oracle, Postgres two-node clusters can be configured to maximize performance, availability, or protection.

| Oracle Data Guard mode | Behavior | Patroni configuration | CloudNativePG configuration |
| --- | --- | --- | --- |
| Max Performance (Oracle default) | Async; fastest commits; possible data loss on failover | Patroni default | CNPG default |
| Max Availability (NOAFFIRM) | Sync when standby available; acknowledge after standby write (not flush); if none available, don't block | synchronous_mode: true; synchronous_commit: remote_write | method: any; number: 1; dataDurability: preferred; synchronous_commit: remote_write |
| Max Availability (AFFIRM) | Sync when standby available; acknowledge after standby flush; if none available, don't block | synchronous_mode: true | method: any; number: 1; dataDurability: preferred |
| Max Protection | Always sync; if no sync standby, block commits (no data loss) | synchronous_mode: true; synchronous_mode_strict: true | method: any; number: 1 |
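
For reference, here is a minimal sketch of what the Max Availability (NOAFFIRM) row above could look like as a CloudNativePG Cluster manifest. The cluster name and storage size are placeholders rather than values from my environment:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example                    # placeholder name
spec:
  instances: 2
  postgresql:
    parameters:
      synchronous_commit: "remote_write"   # NOAFFIRM analog: ack after standby write, not flush
    synchronous:
      method: any
      number: 1
      dataDurability: preferred            # don't block commits if no standby is available
  storage:
    size: 10Gi                             # placeholder
```

Dropping the synchronous_commit override gives the AFFIRM row, and leaving dataDurability at its default of required gives Max Protection.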

    Automated failovers can involve a small amount of data loss with maximum performance and maximum availability configurations. With Oracle Fast-Start Failover, the FastStartFailoverLagLimit configuration property indicates the maximum amount of data loss that is permissible in order for an automatic failover to occur.

The previous blog post in this series compared CloudNativePG Max Performance and Max Protection modes. Now I want to take a look at Max Availability. In CloudNativePG, the key setting here is spec.postgresql.synchronous.dataDurability. When dataDurability is set to preferred, the required number of synchronous instances adjusts based on the number of available standbys. PostgreSQL will attempt to replicate WAL records to the designated number of synchronous standbys, but write operations will continue even if fewer than the requested number of standbys are available.

    All of these experiments were executed on my HP EliteBook (Ryzen Pro 5) with two CNPG Lab VMs via Hyper‑V and the tests ran in a loop for 12–24 hours to aggregate failure rates across the runs.

    Experiment 1

I used the same test harness as before to induce rapid failures. The harness waits for all replicas to be READY (per k8s) and then immediately kills the writer.

    Hypothesis: in max protection mode we won’t see any data loss, but we will see data loss in max availability mode. Adding a third node to the cluster should reduce the likelihood of data loss.

| dataDurability | instances | runs showing data loss |
| --- | --- | --- |
| required | 2 | 0% [results] |
| preferred | 2 | 48% [results] |
| preferred | 3 | 4% [results] |

    Findings: Setting dataDurability: preferred in CloudNativePG allows for higher availability but can result in data loss during failover, especially in smaller clusters. I was surprised how much the third node helped.

    Experiment 2

    Hypothesis A: I was seeing a high failure rate specifically because the rapid failures were triggering a failover before CloudNativePG had enough time to restart synchronous replication after the last failure. If there are 60 seconds between each failure, then we shouldn’t see any data loss.

    Hypothesis B: CloudNativePG has a failoverDelay setting which can inject a delay before the CNPG reconciliation loop triggers a failover when the primary is unhealthy. If we set this to 60 seconds then we shouldn’t see any data loss.
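
To illustrate, this is the fragment of the Cluster spec in question (a sketch showing only the field under discussion):

```yaml
# Fragment of a CloudNativePG Cluster spec (sketch)
spec:
  failoverDelay: 60   # seconds to wait, after the primary is detected unhealthy, before failing over
```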

    nb. I also switched to running the latest development build from the trunk of CloudNativePG. (Separately, I had wanted to test some code that was checked in the day before I ran these tests.)

| seconds between kills | failoverDelay | runs showing data loss |
| --- | --- | --- |
| 0 | 0 | 40% [results] |
| 60 | 0 | 4% [results] |
| 0 | 60 | 0% [results] |

    Findings: Introducing a delay – either by spacing out failures or by configuring failoverDelay – dramatically reduced or eliminated data loss in preferred mode. When failures occurred back-to-back with no delay, data loss was frequent. However, waiting 60 seconds between failures, or setting a 60-second failoverDelay, allowed CloudNativePG enough time to reestablish synchronous replication, resulting in little or no data loss.

    What this means

CloudNativePG’s preferred data durability mode offers data safety and high availability with lower-cost two-node clusters by allowing commits to proceed even if the synchronous standby is temporarily unavailable. However, this flexibility comes with a small risk of data loss during failover, especially when failures happen in rapid succession. Introducing a delay via the failoverDelay setting reduces this risk. For environments where data durability is paramount, three-node clusters in required mode remain the safest choice, but for those willing to trade a small risk of data loss for improved availability, two-node clusters in preferred mode can be a practical option. Consider setting failoverDelay alongside preferred durability for extra safety.
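
Putting the two together, a sketch of a two-node preferred-durability cluster with a failover delay (the name and storage size are placeholders):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-preferred    # placeholder name
spec:
  instances: 2
  failoverDelay: 60     # trade ~60s of failover latency for a smaller data-loss window
  postgresql:
    synchronous:
      method: any
      number: 1
      dataDurability: preferred
  storage:
    size: 10Gi          # placeholder
```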


    About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider


    Disclaimer

This is my personal website. The views expressed here are mine alone and may not reflect the views of my employer. I am currently looking for consulting and/or contracting work in the USA around the Oracle database ecosystem.

contact: 312-725-9249 or schneider @ ardentperf.com

