NRG: WAL requires repair after truncation#7587


Draft
MauriceVanVeen wants to merge 5 commits into main from maurice/nrg-truncate-recovery (base: main)

Conversation

@MauriceVanVeen (Member) commented Nov 27, 2025 (edited)

The WAL was assumed to never be corrupted, so corruption would lead to truncation and a chance for state to diverge. This PR fixes that by:

  • Marking an empty or truncated log as needing "repair". The server must then be caught up to where the log is meant to be before it is marked fully operational again.
  • Previously, when recovering an empty log, catching up a single item, and then being restarted, we would lose the knowledge that the log still needed repair. We now persist this by writing an (empty) file named repair.idx, which exists for the lifetime of the repair. Since this file is only checked or written on startup and when the repair completes, it is not expensive to maintain. (This information could technically be stored alongside the term and vote in a file like tav.idx, but that file is not currently extensible, so a separate file is required for now.)
  • We can now recover properly and avoid divergence even if a majority of nodes have been corrupted and/or truncated and need repair. As long as there is a single server with a complete log, we reliably halt until all servers agree that this single server is the only one eligible to become leader.
  • We could also desync if there was an outdated server and only a single server with a corrupted/truncated log (if those two can form a majority for R3). A server requiring repair will now NOT count the outdated server's vote toward the majority, preventing itself from becoming leader. The server that contains the largest log will become leader once it is available again.

This PR builds on the concept of "empty votes" and fixes some bugs in the PR that introduced it:#7038. As highlighted in that PR, and including this PR's fixes, this means that:

  • If a majority with valid logs is available, the server with the most up-to-date log will become leader.
  • If there's no majority, ALL servers need to be available to make a decision. Raft will halt until that time.
  • This ensures that, as long as a single server still has the data, it can be reliably restored.
  • If all servers have lost some data, the server with the most up-to-date log will become leader.
  • If all servers have lost all data, a random server will become leader.
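The vote-counting rule for a repairing server can be sketched as follows. This is a simplified illustration, not the PR's actual code: the type peerVote and function shouldCountVote are hypothetical names, and the comparison assumes the standard Raft "at least as up-to-date" check (term first, then index).

```go
package main

import "fmt"

// peerVote is a hypothetical stand-in for the vote information a candidate
// receives from a peer.
type peerVote struct {
	lastTerm  uint64 // term of the peer's last log entry
	lastIndex uint64 // index of the peer's last log entry
}

// shouldCountVote sketches the rule above: a candidate whose own log needs
// repair must NOT count votes from peers whose logs are behind its own, since
// such outdated peers cannot vouch that the candidate's log is complete.
func shouldCountVote(selfNeedsRepair bool, selfTerm, selfIndex uint64, v peerVote) bool {
	if !selfNeedsRepair {
		// A server with an intact log counts granted votes as usual.
		return true
	}
	// Standard Raft up-to-date comparison: higher term wins; equal terms
	// compare by index.
	if v.lastTerm != selfTerm {
		return v.lastTerm > selfTerm
	}
	return v.lastIndex >= selfIndex
}

func main() {
	// A repairing candidate at (term 5, index 10) receives a vote from an
	// outdated peer at (term 5, index 9): the vote must be ignored, so the
	// server with the larger log can become leader once available again.
	fmt.Println(shouldCountVote(true, 5, 10, peerVote{lastTerm: 5, lastIndex: 9}))
}
```

This keeps the R3 scenario above safe: the outdated server's vote no longer lets a truncated server win an election.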

The latter two points are technically unsafe, since a Raft-based system is normally meant to halt in such cases, but this is where we prefer the system to become available again. To illustrate with a simpler example: for an R3 in-memory stream, two servers can be restarted and lose all their data, yet the data is not entirely lost as long as a single server still holds it. However, if all servers were restarted and all data was (obviously) lost, we'd rather not halt but instead accept the loss and continue operation, while ensuring all servers agree on the state of the log, albeit reset. In short, we try as best we can to preserve the log, but if against all odds the data ends up lost, we'd rather not brick the system to the point of requiring manual intervention.
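The quorum behaviour described in the bullets above can be condensed into a small sketch. The function name and shape are illustrative assumptions, not the PR's implementation; it only restates the two rules from the description (a majority with valid logs decides normally, otherwise every server must respond).

```go
package main

import "fmt"

// neededQuorum sketches the halting rule: if a majority of servers still have
// valid logs, a normal Raft majority decides the election; otherwise ALL
// servers must be available before a decision is made, so a lone server with
// a complete log cannot be outvoted while it is down.
func neededQuorum(clusterSize, serversWithValidLogs int) int {
	majority := clusterSize/2 + 1
	if serversWithValidLogs >= majority {
		return majority // normal Raft majority
	}
	return clusterSize // halt until every server weighs in
}

func main() {
	// R3 cluster with all logs valid: the usual majority of two decides.
	fmt.Println(neededQuorum(3, 3))
	// R3 cluster where two logs were truncated: all three servers required.
	fmt.Println(neededQuorum(3, 1))
}
```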

Additionally, the user previously had no way of knowing this happened. Now, after all three servers hosting an R3 in-memory stream are restarted such that all data of the stream was lost, we log the following:

[WRN] RAFT [yrzKKRBu - S-R3M-06J0M6mt] Self got voted leader by all servers, restarting WAL with 0 entries, the log was fully lost

If only a single server with the in-memory data was left and was catching up followers but was shut down halfway through, the most up-to-date follower with partial data will become the new leader, and the above message will be logged but with "the log was partially reset" instead.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

@MauriceVanVeen force-pushed the maurice/nrg-truncate-recovery branch from 693c34e to 3624b16 on November 27, 2025 at 14:48
@MauriceVanVeen force-pushed the maurice/nrg-truncate-recovery branch from 2c7e6fe to c781449 on November 28, 2025 at 10:00