server: validate n_batch == n_ubatch for embeddings (#6263) #18123


Draft
yifant-code wants to merge 1 commit into ggml-org:master from yifant-code:fix/embedding-ubatch-validation

@yifant-code
Contributor

Fixes #6263

Problem

The server accepts mismatched --batch-size and --ubatch-size values when --embedding is enabled, which leaves it with an incoherent configuration.

Embeddings use non-causal attention, which requires all tokens to be processed in a single ubatch (n_batch == n_ubatch). The default values differ (n_batch=2048, n_ubatch=512), so users frequently encounter this issue.

Solution

Add parameter validation in main():

  • Detect when --embedding is enabled and n_batch != n_ubatch
  • Log warnings explaining the requirement
  • Automatically set both to min(n_batch, n_ubatch)

This uses the auto-correction approach (suggested by @mirekphd), which gives better UX than strict rejection.
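The auto-correction described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual patch: the `batch_params` struct and the `fix_embedding_batch` function are hypothetical names introduced here, standing in for whatever parameter structure and call site the server really uses. Only the default values (2048 and 512) and the min() rule come from the description.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the subset of server parameters involved.
// Defaults mirror the values quoted in the PR description.
struct batch_params {
    int32_t n_batch   = 2048;  // logical batch size   (-b  / --batch-size)
    int32_t n_ubatch  = 512;   // physical batch size  (-ub / --ubatch-size)
    bool    embedding = false; // set by --embedding
};

// Hypothetical helper: if embeddings are enabled and the two sizes differ,
// warn and set both to the smaller value. Returns true if anything changed.
bool fix_embedding_batch(batch_params & p) {
    if (!p.embedding || p.n_batch == p.n_ubatch) {
        return false; // nothing to correct, stay silent
    }

    const int32_t n = std::min(p.n_batch, p.n_ubatch);

    std::fprintf(stderr,
        "warning: embeddings require n_batch == n_ubatch "
        "(got n_batch=%d, n_ubatch=%d); setting both to %d\n",
        (int) p.n_batch, (int) p.n_ubatch, (int) n);

    p.n_batch  = n;
    p.n_ubatch = n;
    return true;
}
```

Choosing the minimum rather than the maximum means the corrected configuration never asks for a larger ubatch than the user requested, at the cost of capping the number of tokens accepted per batch.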

Testing

✅ Builds successfully
✅ Validation triggers: -b 2048 -ub 512 --embedding → logs warnings, sets both to 512
✅ No false positives: -b 512 -ub 512 --embedding → silent
✅ Tested on macOS M3 Pro with embedding model

Fixes ggml-org#6263 where server accepts mismatched batch/ubatch values with embeddings, leading to suboptimal or incorrect behavior.

Problem: Embeddings and reranking use non-causal attention which requires all tokens to be processed within a single ubatch. When n_batch != n_ubatch, the configuration is incoherent. Default values differ (n_batch=2048, n_ubatch=512), so users encounter this frequently.

Solution:
- Add parameter validation in main() after common_params_parse()
- When embeddings enabled and n_batch != n_ubatch:
  * Log warnings explaining the requirement
  * Automatically set both to min(n_batch, n_ubatch)
  * Ensure coherent configuration

This follows the auto-correction approach suggested by @mirekphd and provides better UX than strict rejection.

Testing:
✅ Builds successfully
✅ Validation triggers: -b 2048 -ub 512 --embedding → logs warnings, adjusts both to 512
✅ No false positives: -b 512 -ub 512 --embedding → no warnings
✅ Verified on macOS M3 Pro with embedding model

Reviewers

@ngxson: awaiting requested review. ngxson will be requested when the pull request is marked ready for review. ngxson is a code owner.

@ggerganov: awaiting requested review. ggerganov will be requested when the pull request is marked ready for review. ggerganov is a code owner.

At least 1 approving review is required to merge this pull request.

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

server: exit failure if --embedding is set with an incoherent --ubatch-size

1 participant

@yifant-code
