rabbit_quorum_queue: Shrink batches of QQs in parallel #15081


Open

the-mikedavis wants to merge 1 commit into main from md/parallel-shrink

Conversation

@the-mikedavis (Collaborator) commented Dec 5, 2025 (edited)

Shrinking a member node off of a QQ can be parallelized. The operation involves

  • removing the node from the QQ's cluster membership (appending a command to the log and committing it) with `ra:remove_member/3`
  • updating the metadata store to remove the member from the QQ type state with `rabbit_amqqueue:update/2`
  • deleting the queue data from the node with `ra:force_delete_server/2` if the node can be reached

All of these operations are I/O bound. Updating the cluster membership and metadata store involves appending commands to those logs and replicating them. Writing commands to Ra synchronously in serial is fairly slow; sending many commands in parallel is much more efficient. By parallelizing these steps we can write larger chunks of commands to the WAL(s).

`ra:force_delete_server/2` also benefits from parallelization when the node being shrunk off is no longer reachable, for example after some hardware failures. The underlying `rpc:call/4` will attempt to auto-connect to the node, which can take some time to time out. When the calls run in parallel, each `rpc:call/4` reuses the same underlying distribution entry and they all fail together once the connection fails to establish.

Discussed in #15057
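
A minimal sketch of the parallel step described above, assuming the per-queue work is done by the `rabbit_quorum_queue:delete_member/2` helper named later in this thread (the module name, error handling, and the exact helper signature are illustrative, not the actual patch):

```erlang
-module(qq_shrink_sketch).
-export([shrink_in_parallel/2]).

%% Spawn one worker per quorum queue so the membership change, metadata
%% update and force-delete for every queue are in flight at the same time,
%% letting Ra batch many commands into each WAL write. delete_member/2 is
%% the existing helper mentioned in this thread; its signature here is an
%% assumption.
shrink_in_parallel(Queues, Node) ->
    Parent = self(),
    Workers = [{Q, spawn(fun() ->
                                 %% catch so the parent never blocks forever
                                 %% waiting on a crashed worker
                                 Res = (catch rabbit_quorum_queue:delete_member(Q, Node)),
                                 Parent ! {self(), Res}
                         end)}
               || Q <- Queues],
    [{Q, receive {Pid, Res} -> Res end} || {Q, Pid} <- Workers].
```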

@the-mikedavis (Collaborator, Author) commented Dec 5, 2025 (edited)

With this change and the default of 64 set here (just a sensible-seeming constant) I see my test in #15057 of shrinking 1000 QQs go from taking ~2 hrs to taking 1 min 52 sec.
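
For illustration only, batching with that default of 64 could look roughly like the following, reusing the sketch above (the constant and function names are assumptions, not the patch's actual code):

```erlang
-module(qq_shrink_batches).
-export([shrink_in_batches/2]).

%% Assumed default batch size matching the constant mentioned above.
-define(BATCH_SIZE, 64).

%% Split the full list of queues into batches and shrink each batch in
%% parallel (see qq_shrink_sketch above) before starting the next one,
%% bounding how many workers run at once.
shrink_in_batches(Queues, Node) ->
    lists:append(
      [qq_shrink_sketch:shrink_in_parallel(Batch, Node)
       || Batch <- chunk(Queues, ?BATCH_SIZE)]).

chunk([], _N) ->
    [];
chunk(List, N) when length(List) =< N ->
    [List];
chunk(List, N) ->
    {Batch, Rest} = lists:split(N, List),
    [Batch | chunk(Rest, N)].
```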


@kjnilsson (Contributor) commented

This looks fine to me, at least for now.

It would be quite possible to get much higher throughput on this and use command pipelining instead of spawning a bunch of processes just to exercise the WAL more. We'd need to add that as an option to the Ra API, however.

@the-mikedavis (Collaborator, Author) commented

Ah yeah, with pipelining we could use the WAL much more efficiently. That shouldn't be too bad to add to Ra - just a new function in `ra` that would use `ra_server_proc:cast_command/3`, right? Once mnesia is gone we could use Khepri async commands for the metadata store updates, so both of those parts could be done with pipelining.

I'm actually more worried about the `ra:force_delete_server/2` part since that step can take a while (7 seconds) if the connection to the node times out. An easy way around that would be adding a function in `rabbit_quorum_queue` to call `ra:force_delete_server/2` on all queues after the membership and metadata store parts are done. Then it would just be one RPC call which could time out.

In the meantime making this parallel seems like an easy improvement since we can continue using the `delete_member/2` helper. But in the long run we should definitely use pipelining instead 👍
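
The single-RPC idea above might look roughly like this hypothetical helper, which runs all force-deletes locally on the departing node so only one call can hit the connection timeout (the function name, the use of `erpc` in place of `rpc:call/4`, and the `quorum_queues` Ra system name are assumptions):

```erlang
-module(qq_force_delete_sketch).
-export([force_delete_all/2]).

%% One remote call to Node that force-deletes every Ra server for the
%% shrunk queues locally, so an unreachable node costs a single timeout
%% instead of one per queue.
force_delete_all(Node, ServerIds) ->
    try
        erpc:call(Node,
                  fun() ->
                          [ra:force_delete_server(quorum_queues, Id)
                           || Id <- ServerIds]
                  end,
                  30000)
    catch
        error:{erpc, Reason} ->
            %% e.g. timeout or noconnection when Node is unreachable
            {error, Reason}
    end.
```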

Reviewers

@michaelklishin left review comments

4 participants

@the-mikedavis, @kjnilsson, @michaelklishin
