Posted onJul 18

Understanding Durability in PostgreSQL The "D" in ACID

In our previous deep-dive articles on ACID, we exploredIsolation in PostgreSQL, understanding how isolation levels work under the hood. You can revisit that discussionhere.

It's been a few months since that article (life happens - juggling a job, projects, and everything in between), but I'm now back and committed to writing consistently. I craft these articles not only for you but also as a resource for my future self, which is why I delve deeply into specific subtopics and include practical prototypes for better clarity. Today, we're going to break down the final, crucial pillar of the ACID model:Durability.

1. What durability means?

Durability means that once atransaction is committed, its changesmust persist no matter what. Even if the database server crashes or there's a power loss, the committed changesmust still be there when the system comes back up.

Sounds simple? The internal mechanisms aren't.

2. How PostgreSQL manages durability under the hood

A common misconception especially among developers new to PostgreSQL is that when a write query is executed (likeINSERT,UPDATE, orDELETE), the changes areimmediately written to disk. Contrary to what many assume, PostgreSQL doesnot immediately write changes to disk when a query is executed. Instead, it follows a carefully layered process that balances speed with reliability.

Let’s examine that process in structured steps.

2.1 Modifications Begin in Memory: The Buffer Cache

All write operations in PostgreSQL first modify datain memory, not on disk. This memory region is known as thebuffer cache.

PostgreSQL stores all data in fixed-size blocks calleddata pages (typically 8KB).
When a row is updated or inserted, the relevant data page is loaded into memory (if not already present).
The modification is applied directly to thein-memory copy of the page.

This design greatly enhances performance by minimizing disk I/O. However, it introduces a serious risk:memory is volatile. If the system crashes now, those changes are gone.

So how does PostgreSQL protect those changes? That’s where the next mechanism comes in.

3. Write-Ahead Logging (WAL): Durability’s Foundation

To guarantee durability, PostgreSQL uses theWrite-Ahead Log (WAL) a journal-like structure that records what changes are about to happenbefore they’re applied to disk.

3.1 What the WAL Does

For every data modification, PostgreSQL generates a WAL record describing the change.
This WAL record is written andflushed to disk before the transaction is considered committed.
Only after this flush does PostgreSQL report the transaction as "successfully committed."

This ensures thateven if a crash occurs, PostgreSQL can use the WAL to replay andreapply changes upon recovery.

You can dive deeper into this concept in my article onAtomicity, where I also provide aprototype WAL simulation in Go to illustrate how the logging and recovery process works.

4. Dirty Pages and Deferred Disk Writes

Even after WAL is safely persisted, theactual data pages in memory are not immediately written to disk. These modified pages are known asdirty pages.

4.1 When Are Dirty Pages Flushed?

Checkpoints: Periodic events where PostgreSQL flushes dirty pages to disk to reduce recovery time.
Background Writer: A dedicated process that writes dirty pages incrementally, smoothing out I/O spikes.

This design allows PostgreSQL tobatch writes and avoid frequent disk operations, which boosts throughput without compromising durability since WAL is already safely stored.

To understand the interaction between buffer cache, background writer, and checkpointing in more depth, refer tothis analysis of PostgreSQL's memory components.

5. Crash Recovery: Why WAL Exists in the First Place

Let’s briefly imagine a systemwithout WAL. If all changes live only in memory, and the server crashes, those changes vanish even if users were told the transaction succeeded.

That’s acatastrophic failure of durability.

WAL ensures that PostgreSQL can recover by:

Scanning the WAL log on startup
Replaying committed changes to reconstruct the latest consistent state

This recovery process isautomatic and forms the backbone of PostgreSQL's fault tolerance.

6. Practical Implications: A Real-World Banking Scenario

Consider this use case: a user initiates abank transfer. The application confirms success, but the server crashes milliseconds later.

If durability is not guaranteed, that transaction might never make it to the ledger:

The sender’s balance was reduced in memory but not written to disk.
The recipient never receives the money.

Such a failure damages trust and introduces serious compliance issues. Durability mechanisms like WAL ensure thatconfirmed transactions remain valid, no matter what.

7. Distributed Transactions and Durability

Durability becomes even more challenging indistributed systems, where data spans multiple nodes or databases.

To ensure consistency across all participants, protocols like:

Two-Phase Commit (2PC)
Three-Phase Commit (3PC)
Consensus algorithms (Raft, Paxos)

...are used to make sure either:

All nodes commit the transaction
Or all abort ensuringatomicity and durability even in partial failures

While PostgreSQL itself is a single-node system, extensions likeCitus or external systems likeCockroachDB apply these principles in distributed settings.

8. Summary: Why Durability Is Non-Negotiable

Durability in PostgreSQL is not an optional feature it isintegral to correctness. Here's a recap of how it is achieved:

Changes are made to memory (buffer cache)
A WAL record is generated and flushed to disk before commit
Dirty pages remain in memory and are flushed later via checkpoints or background writer
WAL enables crash recovery by replaying committed changes

Without these safeguards,even successful transactions can silently vanish, leading to irreparable data loss.