Fix flashinfer plan call to use positional arguments for #3165 #3166
cu_seqlens,
indptr,
block_tables,
last_page_len,
num_heads,
num_kv_heads,
head_size,
page_size,
I don't think moving away from kwargs is the right solution. The error shows exactly why using kwargs is good: it catches changes to arguments. We should update to the latest flashinfer instead, see: https://github.com/huggingface/text-generation-inference/pull/3164/files
However, a lot of the test outputs change after upgrading, and I haven't had the time yet to go through them and see whether they are all minor fluctuations.
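To illustrate the point about kwargs catching argument changes, here is a minimal sketch. `plan_old` and `plan_new` are hypothetical stand-ins for two versions of a library function, not flashinfer's real signatures; the parameter names are borrowed from the diff only for readability.

```python
# Hypothetical stand-ins for two versions of a library function.
# These are NOT flashinfer's real signatures.
def plan_old(indptr, last_page_len, page_size):
    return {"indptr": indptr, "last_page_len": last_page_len, "page_size": page_size}


# The assumed newer version renames one parameter and reorders the last two.
def plan_new(indptr, page_size, last_page_len_arr):
    return {"indptr": indptr, "page_size": page_size, "last_page_len_arr": last_page_len_arr}


def call_with_kwargs(plan):
    # Every argument is bound by name, so a renamed parameter is rejected.
    return plan(indptr=[0, 2], last_page_len=[1], page_size=16)


def call_positionally(plan):
    # Arguments are bound by position only; names are never checked.
    return plan([0, 2], [1], 16)


# Keyword call against the old signature works.
old_result = call_with_kwargs(plan_old)

# Keyword call against the new signature fails loudly with a TypeError.
try:
    call_with_kwargs(plan_new)
    kwargs_error = None
except TypeError as exc:
    kwargs_error = exc

# Positional call against the new signature succeeds, but the list meant for
# last_page_len now lands in page_size and the integer lands in last_page_len_arr.
positional_result = call_positionally(plan_new)
```

In this sketch the keyword call surfaces the signature change immediately, while the positional call binds the values to the wrong parameters without any error.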
What does this PR do?
This changes the call to BatchPrefillWithPagedKVCacheWrapper.plan to use positional arguments instead of keyword arguments. Not sure why Python is acting like this.
Fixes #3165
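One way to see why a keyword call is rejected is to compare the keyword names being passed against the signature of the installed function. The sketch below uses only the standard library's `inspect` module; `plan` is a hypothetical stand-in with made-up parameters, and `unsupported_kwargs` is a helper introduced here for illustration, not part of flashinfer or this repository.

```python
import inspect


# Hypothetical stand-in for the installed function; NOT flashinfer's real signature.
def plan(indptr, indices, last_page_len, num_heads):
    return None


def unsupported_kwargs(func, **kwargs):
    """Return the sorted keyword names that func's signature would reject."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword name.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    # Only these parameter kinds can be passed by keyword.
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in by_keyword}
    return sorted(name for name in kwargs if name not in accepted)


# Names the stand-in does not accept are reported back.
rejected = unsupported_kwargs(plan, indptr=[0, 2], block_tables=[[0]], num_heads=8)
```

Running the same check against the real wrapper method in the installed environment would show which names differ from the ones used at the call site.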
Who can review?
@Narsil @danieldk - please review