Pull requests: InternLM/lmdeploy
Pull requests list
fix: change debug log from ERROR to DEBUG in RepetitionPenaltyKernel
#4363 opened Feb 15, 2026 by murray-macdonald
ci(lint): skip flaky deadlink test for python wiki page
#4357 opened Feb 13, 2026 by windreamer
Fix XGrammar bitmask initialization and add null check for gen_config in generate method
#4349 opened Feb 11, 2026 by windreamer
add preliminary support for EP(single-node) of turbomind backend
#4332 opened Feb 6, 2026 by irexyc
Qwen/Internlm/Llama Dense/Moe model fp8 quant online [enhancement]
#4324 opened Feb 5, 2026 by 43758726
Compatible with transformers 5.0 at TurboMind side [improvement]
#4304 opened Jan 28, 2026 by lvhan028
change ascend paged attention from BSH format to TND format for better performance
#4295 opened Jan 27, 2026 by jinminxi104 • Draft
support repetition ngram logits processor [enhancement]
#4288 opened Jan 23, 2026 by grimoire
Support fp32 head for qwen and internlm models [improvement]
#4160 opened Nov 27, 2025 by RunningLeon