NRG: WAL requires repair after truncation#7587


Draft
MauriceVanVeen wants to merge 5 commits into main from maurice/nrg-truncate-recovery (base: main)

Conversation

@MauriceVanVeen (Member) commented Nov 27, 2025 (edited)

The WAL was assumed to never be corrupted, so corruption would lead to truncation and a chance for state to diverge. This PR fixes that by:

  • Marking an empty or truncated log as needing "repair". The server must then be caught up to where the log is meant to be before it is marked fully operational again.
  • Previously, when recovering an empty log, catching up a single item, and then being restarted, we would lose the knowledge that the log still needed repair. We now persist this by writing an (empty) file named repair.idx, which exists for the lifetime of the repair. Since this file is only checked or written on startup and when the repair completes, it is not expensive to maintain. (This information could technically be stored alongside the term and vote in a file like tav.idx, but that file is not currently extensible, so a separate file is required for now.)
  • We can now recover properly and avoid divergence even if a majority of nodes have been corrupted and/or truncated and need repair. As long as there is a single server with a complete log, we reliably halt until all servers agree that this single server is the only one eligible to become leader.
  • We could also desync if there was an outdated server and only a single server with a corrupted/truncated log (if those two can form a majority for R3). A server requiring repair will now NOT count the outdated server's vote toward the majority, preventing itself from becoming leader. The server that contains the largest log will become leader once it is available again.

This PR builds on the concept of "empty votes" and fixes some bugs in the PR that introduced it:#7038. As highlighted in that PR, and including this PR's fixes, this means that:

  • If a majority with valid logs is available, the server with the most up-to-date log will become leader.
  • If there's no majority, ALL servers need to be available to make a decision. Raft will halt until that time.
  • This ensures that, as long as a single server still has the data, it can be reliably restored.
  • If all servers have lost some data, the server with the most up-to-date log will become leader.
  • If all servers have lost all data, a random server will become leader.
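The vote-counting rule for a repairing server can be sketched as follows. This is a simplified illustration, not the PR's actual code: the type peerVote and function shouldCountVote are hypothetical names, and the comparison assumes the standard Raft "at least as up-to-date" check (term first, then index).

```go
package main

import "fmt"

// peerVote is a hypothetical stand-in for the vote information a candidate
// receives from a peer.
type peerVote struct {
	lastTerm  uint64 // term of the peer's last log entry
	lastIndex uint64 // index of the peer's last log entry
}

// shouldCountVote sketches the rule above: a candidate whose own log needs
// repair must NOT count votes from peers whose logs are behind its own, since
// such outdated peers cannot vouch that the candidate's log is complete.
func shouldCountVote(selfNeedsRepair bool, selfTerm, selfIndex uint64, v peerVote) bool {
	if !selfNeedsRepair {
		// A server with an intact log counts granted votes as usual.
		return true
	}
	// Standard Raft up-to-date comparison: higher term wins; equal terms
	// compare by index.
	if v.lastTerm != selfTerm {
		return v.lastTerm > selfTerm
	}
	return v.lastIndex >= selfIndex
}

func main() {
	// A repairing candidate at (term 5, index 10) receives a vote from an
	// outdated peer at (term 5, index 9): the vote must be ignored, so the
	// server with the larger log can become leader once available again.
	fmt.Println(shouldCountVote(true, 5, 10, peerVote{lastTerm: 5, lastIndex: 9}))
}
```

This keeps the R3 scenario above safe: the outdated server's vote no longer lets a truncated server win an election.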

The latter two points are technically unsafe, since a Raft-based system is normally meant to halt in such cases, but this is where we prefer the system to become available again. To illustrate with a simpler example: for an R3 in-memory stream, two servers can be restarted and lose all their data, yet the data is not entirely lost as long as a single server still holds it. However, if all servers were restarted and all data was (obviously) lost, we'd rather not halt but instead accept the loss and continue operation, while ensuring all servers agree on the state of the log, albeit reset. In short, we try as best we can to preserve the log, but if against all odds the data ends up lost, we'd rather not brick the system to the point of requiring manual intervention.
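The quorum behaviour described in the bullets above can be condensed into a small sketch. The function name and shape are illustrative assumptions, not the PR's implementation; it only restates the two rules from the description (a majority with valid logs decides normally, otherwise every server must respond).

```go
package main

import "fmt"

// neededQuorum sketches the halting rule: if a majority of servers still have
// valid logs, a normal Raft majority decides the election; otherwise ALL
// servers must be available before a decision is made, so a lone server with
// a complete log cannot be outvoted while it is down.
func neededQuorum(clusterSize, serversWithValidLogs int) int {
	majority := clusterSize/2 + 1
	if serversWithValidLogs >= majority {
		return majority // normal Raft majority
	}
	return clusterSize // halt until every server weighs in
}

func main() {
	// R3 cluster with all logs valid: the usual majority of two decides.
	fmt.Println(neededQuorum(3, 3))
	// R3 cluster where two logs were truncated: all three servers required.
	fmt.Println(neededQuorum(3, 1))
}
```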

Additionally, the user previously had no way of knowing this happened. Now, after all three servers hosting an R3 in-memory stream are restarted such that all data of the stream was lost, we log the following:

[WRN] RAFT [yrzKKRBu - S-R3M-06J0M6mt] Self got voted leader by all servers, restarting WAL with 0 entries, the log was fully lost

If only a single server with the in-memory data was left and was catching up followers but was shut down halfway through, the most up-to-date follower with partial data will become the new leader, and the above message will be logged but with "the log was partially reset" instead.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

@MauriceVanVeen force-pushed the maurice/nrg-truncate-recovery branch from 693c34e to 3624b16 on November 27, 2025 at 14:48
@MauriceVanVeen force-pushed the maurice/nrg-truncate-recovery branch from 2c7e6fe to c781449 on November 28, 2025 at 10:00