Raft batching #7355

Draft

sciascid wants to merge 5 commits into main from raft-batching

Conversation

@sciascid
Contributor

Changes used to evaluate and improve batching at the Raft level.
These are proofs of concept, not necessarily complete nor sufficiently tested:
for performance evaluation only!

@sciascid
Contributor (Author)

sciascid commented Sep 25, 2025 (edited)

Setup:

3 node cluster, all running on a laptop, with synchronous writes (`sync_interval: always`).
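The synchronous-write setting above can be expressed as a server config fragment. A minimal sketch, assuming the option lives in the `jetstream` block and the `store_dir` path is illustrative; check your server version's documentation:

```conf
# Force a write() + fsync() per stream write, so producers can keep
# the Raft proposal queue busy (the low-throughput baseline above).
jetstream {
    store_dir: "/tmp/nats/jetstream"   # illustrative path
    sync_interval: always
}
```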

Workload:

`nats bench js pub --replicas=3 --clients=10 --msgs=100000 --create --purge --size=1024 test`

Throughput:

| Optimization | Throughput (msgs/s) |
|---|---|
| Baseline | 91 |
| Async stream writes | 584 |
| Async + Improved batching | 593 |
| Async + Reduced lock contention | 21964 |
| All combined | 22967 |

Batching effectiveness:

(figure: batch_comparison)

An easy way to collect batch sizes. For performance testing only. Will be removed.
This is the baseline for performance testing Raft's batching capabilities. The behavior of Raft's batching mechanism is easier to observe if disk writes are synchronous, i.e. we want to write() + fsync() the Raft log, so that producers can easily keep the proposal queue busy. To do so one can set `sync_interval: always`. However, that results in disastrous performance: when the leader receives acks for a "big" batch of log entries, the upper layer will write() and fsync() all entries in the batch individually. So this commit disables "sync always" on stream writes. This *should* work in principle because the data is already in the Raft log. Alternatively, one could implement "group commit" for streams, i.e. fsync() only once after processing a batch of entries. For performance testing only at this point.
This commit removes a "pathological" case from the current Raft batching mechanism: if the proposal queue contains more entries than one batch can fit, then Raft will send a full batch, followed by a small batch containing the leftovers. However, it was observed that it is quite possible that while the first batch was being stored and sent, clients had already pushed more entries into the proposal queue. With this fix the server will compose and send a full batch, then handle the leftovers as follows: if more proposals were pushed into the proposal queue, we carry the leftovers over to the next iteration, so that they are batched together with the proposals that arrived in the meantime. If there are no more proposals, we send the leftovers right away. For performance testing only at this point.
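The carry-over rule described above can be sketched in isolation. `batches` is a hypothetical helper, not the actual server code: it emits full batches, and holds back a final partial batch only when more proposals are known to be pending:

```go
package main

import "fmt"

// batches splits proposals into batches of at most maxBatch entries.
// A final partial batch ("leftovers") is held back whenever more
// proposals are pending (morePending), so it can be merged with the
// next round instead of being sent as a tiny batch on its own.
func batches(proposals []int, maxBatch int, morePending bool) (send [][]int, carry []int) {
	for len(proposals) >= maxBatch {
		send = append(send, proposals[:maxBatch])
		proposals = proposals[maxBatch:]
	}
	if len(proposals) > 0 {
		if morePending {
			carry = proposals // merge with the next round's proposals
		} else {
			send = append(send, proposals) // nothing else coming: flush now
		}
	}
	return send, carry
}

func main() {
	// 10 proposals, batch size 4, more proposals already queued:
	// two full batches go out, the 2 leftovers are carried forward.
	send, carry := batches([]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, 4, true)
	fmt.Println(len(send), len(carry))
}
```

With an empty proposal queue the same leftovers are flushed immediately, so latency is unchanged in the quiet case.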
This is an attempt to reduce contention between Propose() and sendAppendEntry(). Change Propose() to acquire a read lock on Raft, and avoid locking Raft during storeToWAL() (which potentially does IO and may take a long time). This works as long as sendAppendEntry() is called from the Raft goroutine only, unless the entry does not need to be stored to the Raft log. The rest of the changes enforce that requirement:
  * Change EntryLeaderTransfer so that it is not stored to the Raft log.
  * Push EntryPeerState and EntrySnapshot entries to the proposal queue.
  * Make sure EntrySnapshot entries skip the leader check, and make sure those are not batched together with other entries.
For performance testing only at this point.
Limit batch size based on the configured max_payload.
