Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[Messenger] Batch handlers v2#47090

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
upyx wants to merge1 commit intosymfony:6.2fromupyx:messenger-rfc-batch-handler

Conversation

@upyx
Copy link
Contributor

QA
Branch?6.2
Bug fix?no
New feature?yes
Deprecations?possible...
Ticketsnope
LicenseMIT
Doc PRTBD

The Messenger component is a good core of asynchronous applications. Since my team utilizes event-driven architecture, we heavily rely on the library. We've been collecting many ideas on improvement over the past years. Most of them are about extending and flexibility, so they require internal changes to the library. Unfortunately, some changes block each other, so it's difficult to do them piece by piece. Nevertheless, I will try. The first small piece is below.

Introdution

The common example of batch handler (merged in#43354) is:

class MyBatchHandlerimplements BatchHandlerInterface{use BatchHandlerTrait;publicfunction__invoke(MyMessage$message,Acknowledger$ack =null)    {return$this->handle($message,$ack);    }privatefunctionprocess(array$jobs):void    {foreach ($jobsas [$message,$ack]) {try {// [...] compute $result from $message$ack->ack($result);            }catch (\Throwable$e) {$ack->nack($e);            }        }    }}

Where thecompute $result from $message is a place forreal handler job. The other things are part of batch processing itself, which the handler shouldn't care about.

Proposal

In my vision, the ideal batch handler would look like this:

#[AsMessageHandler(batchStrategy:newBatchCount(10))]class MyBatchHandler{publicfunction__invoke(MyMessage ...$messages)    {// handle $messages    }}

Or an interface alternative to an attribute:

class MyBatchHandlerimplements BatchStrategyProviderInterface{publicfunction__invoke(MyMessage ...$messages)    {// handle $messages    }publicfunctiongetBatchStrategy():BatchStrategyInterface    {returnnewCountBatchStrategy(10);    }}

A bit easier, isn't it? However, this way disables the features which we are used to: separate acknowledging and returning results. There is an option to return them, I'll descire it below.

How batches work

A batch processing consists of parts:

  1. Buffering
  2. Handling the buffer
  3. Acknowledge received messages

The first part (buffering) can be easily done by the library internals. The only thing it needs is trigger where to flush the buffer (run a handler). The second part is the handler's job. And the acknowledging... hm, it's some opinionable part.

Triggering a handler

Generally, rules for triggering a handler are more complex than just message counting. Most likely they would look like: "do not process a batch faster than once per 10 seconds if the buffer is less than 100 messages". Rules can be different, and most likely they would be shared between handlers in a group.

class MyBatchStrategyimplements BatchStrategyInterface{privateint$bufferSize =0;privateint$lastHandledNanoseconds =0;publicfunctionshouldHandle(object$lastMessage):bool    {return ($lastMessage->forceHandling)            || (++$this->bufferSize >=100)            || (\hrtime(true) -$this->lastHandledNanoseconds >1_000_000_000);    }publicfunctionbeforeHandle():void    {$this->bufferSize =0;    }publicfunctionafterHandle():void    {$this->lastHandledNanoseconds =\hrtime(true);    }}

Acknowledging

I believe that a batch must be acknowledged or rejected entirely at once. Because it's a batch. If someone needs to acknowledge messages separately, it means that there is no batch but some infrastructure specific optimization or misuse. So, if a handler finishes without an error, the entire batch should be acknowledged.

Returning results

Handlers shouldn't care how they are run synchronously or asynchronously, they should do their job interchangeably and never return a result, but save it or dispatch further.Here is a perfect explanation of the difference between Messenger and Event Dispatcher. It refers to the main architecture idea of the Messenger which makes it so powerful: itnever communicates back.

The possibility of returning results through synchronous transport is a historical accident and a misuse of the tool.

Use legacy features

I absolutely sure they are useless. However, it's might be overkill to change all legacy code. As I promised, there is a way to use them:

#[AsMessageHandler(batchStrategy:newBatchCount(10))]class MyBatchHandler{publicfunction__invoke(Result$r,MyMessage ...$messages)    {$results =$errors = [];try {// compute $results from $messages        }catch (\Throwable$e) {// save $errors        }foreach ($resultsas [$message,$result])$r->ack($message,$result);foreach ($errorsas [$message,$error])$r->reject($message,$error);    }}

TheResult object has aSplObjectStorage that maps messeges withAcknowledger, so it can be shared.

Deprecations

I'd like to deprecateBatchHandlerInterface,Acknowledger andHandledStamp::getResult().

Implementation

There is no proper implementation yet. The PR has a dirty draft to show an idea and play with it. Some tests failed because of a new optional argument inHandlerDescriptor, but it works.

@carsonbotcarsonbot added Status: Needs Review Feature Messenger RFCRFC = Request For Comments (proposals about features that you want to be discussed) labelsJul 28, 2022
@carsonbotcarsonbot added this to the6.2 milestoneJul 28, 2022
@carsonbotcarsonbot changed the title[RFC][Messenger] Batch handlers v2[Messenger] Batch handlers v2Jul 28, 2022
@nicolas-grekas
Copy link
Member

Hi, thanks for the proposal and for opening the discussion.

I believe that a batch must be acknowledged or rejected entirely at once. Because it's a batch. If someone needs to acknowledge messages separately, it means that there is no batch but some infrastructure specific optimization or misuse. So, if a handler finishes without an error, the entire batch should be acknowledged.

I have a different opinion here: acknowledging each message separately provides flexibility in handling batches. Eg one can process N messages in parallel and ack/nack as needed. This is a desired feature to me. On the contrary, acknowledging the batch itself would give a false impression that batches are transactions. They're not, since we can't guarantee anything from a transactional pov.

I fear that this difference invalidates most of the proposal :(

lyrixx and b1rdex reacted with thumbs up emoji

@upyx
Copy link
ContributorAuthor

I have a different opinion here

So let's discuss it :)

Eg one can process N messages in parallel and ack/nack as needed. This is a desired feature to me.

Interesting... How do batches help here? When I want parallel processing, I run N workers with "single" handler. Or I can create a custom middleware to process few messages in parallel by one worker with fiber (in PHP 8.2).

On the contrary, acknowledging the batch itself would give a false impression that batches are transactions. They're not, since we can't guarantee anything from a transactional pov.

We can't, it's true. However, a user can! And it's a typical thing thatusers I do with batch handlers.

Anyway, I've seen batches a lot, and have never (can't remember at least) seen that partially acknowledging helps. Could you share your experience and give some example?

I fear that this difference invalidates most of the proposal :(

Absolutely not :) TheResult object in the example is used to acknowledge and set results to messages separately.

@nicolas-grekas
Copy link
Member

nicolas-grekas commentedAug 1, 2022
edited
Loading

The Result object in the example is used to acknowledge and set results to messages separately

Note that the current design keeps LSP, aka batch handlers remain valid regular handlers. PuttingResult as first argument breaks this since batch handlers can't be called as regular handlers anymore. We'd need to preserve this capability.

@lyrixx
Copy link
Member

I agree with@nicolas-grekas about

acknowledging each message separately provides flexibility in handling batches


Anyway, I've seen batches a lot, and have never (can't remember at least) seen that partially acknowledging helps. Could you share your experience and give some example?

Some exemples

  • Instead of calling Elasticsearch each time something changes, we buffer everything and we call thebulk API. It much more efficient for ES. Then is a document cannot be inserted, we can reject the message and send it to a dead letter queue for later inspection.
  • Instead of calling a rate limited API (of translation, but we don't care), we buffer all message, and call it every X secondes, or when the buffer is full. (seehttps://jolicode.com/blog/batching-symfony-messenger-messages for more information - and an old implementation)

In both situation, we are not in a "transactional mode". So beeing able to ack all messages one by one is a must have feature.

About the API, I prefer a new interface too instead of a trait. But we already discussed that in the PR (or mine, I don't remember). And the trait was finally chosen.

@upyx
Copy link
ContributorAuthor

upyx commentedAug 2, 2022

The more features, the more code, the harder to keep things clear and flexible. I'm trying to simplify handlers. Especially batch handlers.

@nicolas-grekas

Note that the current design keeps LSP, aka batch handlers remain valid regular handlers. Putting Result as first argument breaks this since batch handlers can't be called as regular handlers anymore. We'd need to preserve this capability.

The capability preserved byHandlerDescriptor andHandlerLocator semantically. There is no superclass to keepLSP by language features.

I'd like to make a single (m.b. internal) interface for all kinds of handlers and few sequential adapters which simplify work for users and keep BC. So the user would make a choice to implement the fullfeatured interface or "just a handler" part. It moves the logic about batches, buffers, acknowledging, and so on from the middleware internals (see how simpleit was originally) and incapluletes it in places that are intended to.

Than you for examples@lyrixx !

Well, I see a lot of developers probably want to see that feature...

I won't argue for this.

seehttps://jolicode.com/blog/batching-symfony-messenger-messages for more information

We are using something close to that right now. As you can see, a batch handler processes a message with the whole batch.

lyrixx reacted with thumbs up emoji

@upyx
Copy link
ContributorAuthor

upyx commentedAug 2, 2022

Anyway, is this interesting propose or just close it?

@upyx
Copy link
ContributorAuthor

The answer is clear. Maybe next time...

@upyxupyx closed thisAug 23, 2022
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

No reviews

Assignees

No one assigned

Labels

FeatureMessengerRFCRFC = Request For Comments (proposals about features that you want to be discussed)Status: Needs Review

Projects

None yet

Milestone

6.2

Development

Successfully merging this pull request may close these issues.

4 participants

@upyx@nicolas-grekas@lyrixx@carsonbot

[8]ページ先頭

©2009-2025 Movatter.jp