vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
| About Ascend | Documentation | #sig-ascend | Users Forum | Weekly Meeting |
English | 中文
Latest News 🔥
- [2025/06] The User Stories page is now live! It kicks off with LLaMA-Factory/verl/TRL/GPUStack to demonstrate how vLLM Ascend helps Ascend users across fine-tuning, evaluation, reinforcement learning (RL), and deployment scenarios.
- [2025/06] The Contributors page is now live! Every contribution deserves to be recorded; thanks to all contributors.
- [2025/05] We released the first official version, v0.7.3! We collaborated with the vLLM community to publish a blog post sharing our practice: Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU.
- [2025/03] We hosted the vLLM Beijing Meetup with the vLLM team! Please find the meetup slides here.
- [2025/02] The vLLM community officially created the vllm-project/vllm-ascend repo for running vLLM seamlessly on the Ascend NPU.
- [2024/12] We are working with the vLLM community to support [RFC]: Hardware pluggable.
vLLM Ascend (vllm-ascend) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.
It is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples Ascend NPU integration from vLLM.
With the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts, Embedding, and Multi-modal LLMs, can run seamlessly on the Ascend NPU.
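As a quick illustration, the standard vLLM offline-inference API works unchanged once the plugin is installed. The sketch below is illustrative rather than taken from this README: it assumes vllm and vllm-ascend are already installed, and the model name is only an example.

```python
# Minimal offline-inference sketch on an Ascend NPU (illustrative, not from this README).
# Assumes vllm and vllm-ascend are installed; the model name below is just an example.
from vllm import LLM, SamplingParams

prompts = ["Hello, my name is", "The future of AI is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# With vllm-ascend installed, vLLM discovers the Ascend backend through its
# hardware-pluggable interface, so no NPU-specific code is needed here.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"{output.prompt!r} -> {output.outputs[0].text!r}")
```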
Prerequisites
- Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series
- OS: Linux
- Software:
- Python >= 3.9, < 3.12
- CANN >= 8.1.RC1
- PyTorch >= 2.5.1, torch-npu >= 2.5.1.post1.dev20250619
- vLLM (the same version as vllm-ascend)
Please use the following recommended versions to get started quickly:
Version | Release type | Doc
--- | --- | ---
v0.9.2rc1 | Latest release candidate | QuickStart and Installation for more details
v0.7.3.post1 | Latest stable version | QuickStart and Installation for more details
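To confirm that your environment roughly matches the prerequisites above, a small check along the following lines can help. It is only a sketch; the exact versions to expect depend on the release you install.

```python
# Rough environment check against the prerequisites listed above (illustrative sketch).
import sys

# Python >= 3.9, < 3.12
assert (3, 9) <= sys.version_info[:2] < (3, 12), "Python >= 3.9 and < 3.12 is required"

import torch
print("torch:", torch.__version__)  # expect >= 2.5.1

try:
    import torch_npu  # Ascend adapter for PyTorch; requires CANN to be installed
    print("torch_npu:", torch_npu.__version__)
    print("NPU available:", torch.npu.is_available())
except ImportError:
    print("torch_npu is not installed; see the Installation guide for CANN and torch-npu setup")
```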
Contributing
See CONTRIBUTING for more details; it is a step-by-step guide to help you set up the development environment, build, and test.
We welcome and value any contributions and collaborations:
- Please let us know if you encounter a bug by filing an issue.
- Please use the User Forum for usage questions and help.
vllm-ascend has a main branch and dev branches.
- main: the main branch, which corresponds to the vLLM main branch and is continuously monitored for quality through Ascend CI.
- vX.Y.Z-dev: a development branch, created alongside certain vLLM releases. For example, v0.7.3-dev is the dev branch for vLLM v0.7.3.
Below are the maintained branches:
Branch | Status | Note
--- | --- | ---
main | Maintained | CI commitment for vLLM main branch and vLLM 0.9.x branch
v0.7.1-dev | Unmaintained | Only doc fixes are allowed
v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version; only bug fixes are allowed and no new release tags will be created
v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version
Please refer to the Versioning policy for more details.
- vLLM Ascend Weekly Meeting: https://tinyurl.com/vllm-ascend-meeting
- Wednesday, 15:00 - 16:00 (UTC+8, convert to your timezone)
Apache License 2.0, as found in the LICENSE file.