Add batch_size(batch_size) to __find_in_batches (Mongoid) #1036
Open
sylvain-8422 wants to merge 1 commit into elastic:main from sylvain-8422:mongoid-batch-size
shashankjo approved these changes Jul 4, 2023
ef8985e to aa38a1b
sylvain-8422 commented Jul 6, 2023 • edited
Same simple change as before, but I fixed the conflict created by whitespace changes in |
shashankjo approved these changes Jul 7, 2023
Add `.batch_size(batch_size)` to `#__find_in_batches` (Mongoid). Fixes #1037.
Although `.each_slice(batch_size)` is useful for limiting how many documents are sent to Elasticsearch at a time, it does not limit the batch size of MongoDB's `getMore` commands.
By default, iterating over a MongoDB collection first returns 101 documents, and then subsequent batches of up to 16 MiB:
https://www.mongodb.com/docs/manual/tutorial/iterate-a-cursor/#cursor-batches
For example, a MongoDB collection containing documents averaging 1 KiB might return more than 16,000 documents at a time.
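The figure above follows from simple division. A rough sketch of the estimate, assuming every document is exactly 1 KiB and ignoring any per-message overhead (the constant names are illustrative only):

```ruby
# Rough estimate only: assumes uniform 1 KiB documents and ignores
# any framing overhead in the server reply.
MAX_BATCH_BYTES = 16 * 1024 * 1024 # 16 MiB cap on a subsequent batch
AVG_DOC_BYTES   = 1024             # 1 KiB average document size

docs_per_batch = MAX_BATCH_BYTES / AVG_DOC_BYTES
puts docs_per_batch # prints 16384
```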
Although Mongoid's documentation claims a default batch size of 1,000 documents, that does not seem to be the case in practice.
Also, Mongoid's `.no_timeout` is broken right now and does nothing: mongodb/mongo-ruby-driver#2557
It is now likely that more than 10 minutes go by between two `getMore` commands and that the MongoDB cursor expires.
Adding `.batch_size(batch_size)` to the query makes sure that MongoDB documents are retrieved at the same rate as they are processed and indexed in Elasticsearch, and allows applications affected by the `.no_timeout` issue to reduce the batch size to avoid cursor timeouts.
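The difference between slicing on the client and batching on the server can be illustrated without a database. The sketch below is a toy model only: `FakeCursor` is a hypothetical stand-in, not the Mongoid or Ruby driver API, and it ignores the initial 101-document batch. It assumes a default fetch of 16,384 documents per round trip (the 1 KiB estimate above) and records how many documents each simulated round trip returns.

```ruby
# Hypothetical stand-in for a MongoDB cursor; NOT the Mongoid or driver API.
# Each entry in fetch_sizes represents one simulated server round trip.
class FakeCursor
  include Enumerable

  attr_reader :fetch_sizes

  def initialize(docs, batch_size: nil, default_batch: 16_384)
    @docs = docs
    @batch_size = batch_size
    @default_batch = default_batch
    @fetch_sizes = []
  end

  # Chainable, like a query criteria: returns a new cursor with the limit set.
  def batch_size(n)
    FakeCursor.new(@docs, batch_size: n, default_batch: @default_batch)
  end

  def each
    return enum_for(:each) unless block_given?

    @docs.each_slice(@batch_size || @default_batch) do |batch|
      @fetch_sizes << batch.size # one simulated round trip
      batch.each { |doc| yield doc }
    end
  end
end

docs = (1..5_000).to_a

# each_slice alone: the client sees slices of 1,000, but the "server"
# still hands over everything in a single large fetch.
plain = FakeCursor.new(docs)
plain.each_slice(1_000) { |items| items.size }

# batch_size plus each_slice: fetches and slices advance together.
sized = FakeCursor.new(docs).batch_size(1_000)
sized.each_slice(1_000) { |items| items.size }

puts plain.fetch_sizes.inspect
puts sized.fetch_sizes.inspect
```

In this model the first cursor performs one fetch of 5,000 documents while the second performs five fetches of 1,000, so the time between fetches is bounded by the time needed to process one slice rather than the whole default batch.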