
vllm-ascend

| About Ascend | Documentation | #sig-ascend | Users Forum | Weekly Meeting |

English | 中文


Latest News 🔥

  • [2025/06] User stories page is now live! It kicks off with LLaMA-Factory / verl / TRL / GPUStack to demonstrate how vLLM Ascend helps Ascend users enhance their experience across fine-tuning, evaluation, reinforcement learning (RL), and deployment scenarios.
  • [2025/06] Contributors page is now live! All contributions deserve to be recorded; thanks to all contributors.
  • [2025/05] We released the first official version, v0.7.3! We collaborated with the vLLM community to publish a blog post sharing our practice: Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU.
  • [2025/03] We hosted the vLLM Beijing Meetup with the vLLM team! Please find the meetup slides here.
  • [2025/02] The vLLM community officially created the vllm-project/vllm-ascend repo for running vLLM seamlessly on the Ascend NPU.
  • [2024/12] We are working with the vLLM community to support [RFC]: Hardware pluggable.

Overview

vLLM Ascend (vllm-ascend) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.

It is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the Ascend NPU integration from vLLM.

By using the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts, embedding, and multi-modal LLMs, can run seamlessly on the Ascend NPU.
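The hardware-pluggable interface works through Python entry points: when a plugin package such as vllm-ascend is installed, vLLM can discover it at startup without any change to vLLM's own code. A minimal discovery sketch, assuming the entry-point group name `vllm.platform_plugins` from the plugin RFC (treat the group name as an assumption, not a confirmed constant):

```python
from importlib.metadata import entry_points


def discover_plugins(group: str) -> list:
    """Return sorted names of installed entry points in the given group.

    Handles both the Python >= 3.10 ``select()`` API and the older
    dict-style return value of ``entry_points()``.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        return sorted(ep.name for ep in eps.select(group=group))
    return sorted(ep.name for ep in eps.get(group, []))


# With vllm-ascend installed, this would be expected to list its platform
# plugin; in a bare environment the list is simply empty.
print(discover_plugins("vllm.platform_plugins"))
```

This is how "decoupled" integration looks in practice: the plugin registers itself at install time, and the host only has to scan a well-known group name.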

Prerequisites

  • Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series
  • OS: Linux
  • Software:
    • Python >= 3.9, < 3.12
    • CANN >= 8.1.RC1
    • PyTorch >= 2.5.1, torch-npu >= 2.5.1.post1.dev20250619
    • vLLM (the same version as vllm-ascend)
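The Python constraint above is a half-open range (3.12 itself is excluded). A small sketch of checking it before installing; the helper name is illustrative, not part of vllm-ascend:

```python
import sys


def python_version_ok(version_info=None, lower=(3, 9), upper=(3, 12)):
    """Check the documented constraint: Python >= 3.9 and < 3.12."""
    v = (version_info or sys.version_info)[:2]
    return lower <= v < upper


# Examples against the documented range:
assert python_version_ok((3, 9, 0))
assert python_version_ok((3, 11, 8))
assert not python_version_ok((3, 12, 0))
assert not python_version_ok((3, 8, 10))
```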

Getting Started

Please use the following recommended versions to get started quickly:

Version | Release type | Doc
v0.9.2rc1 | Latest release candidate | QuickStart and Installation for more details
v0.7.3.post1 | Latest stable version | QuickStart and Installation for more details

Contributing

See CONTRIBUTING for more details; it is a step-by-step guide to help you set up the development environment, build, and test.

We welcome and value any contributions and collaborations.

Branch

vllm-ascend has a main branch and dev branches.

  • main: the main branch, which corresponds to the vLLM main branch and is continuously monitored for quality through Ascend CI.
  • vX.Y.Z-dev: a development branch, created alongside a new vLLM release. For example, v0.7.3-dev is the dev branch for vLLM v0.7.3.
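The vX.Y.Z-dev naming scheme deterministically maps a dev branch to the vLLM release it tracks. A small sketch of that mapping (the helper is illustrative only, not part of the repo's tooling):

```python
import re

# Pattern for the vX.Y.Z-dev branch naming scheme described above.
DEV_BRANCH = re.compile(r"^v(\d+)\.(\d+)\.(\d+)-dev$")


def vllm_version_for(branch: str):
    """Return the vLLM version a dev branch tracks, or None for other branches."""
    m = DEV_BRANCH.match(branch)
    return "v{}.{}.{}".format(*m.groups()) if m else None


assert vllm_version_for("v0.7.3-dev") == "v0.7.3"
assert vllm_version_for("main") is None
```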

Below are the maintained branches:

Branch | Status | Note
main | Maintained | CI commitment for the vLLM main branch and the vLLM 0.9.x branch
v0.7.1-dev | Unmaintained | Only doc fixes are allowed
v0.7.3-dev | Maintained | CI commitment for vLLM v0.7.3; only bug fixes are allowed, and no new release tags will be created
v0.9.1-dev | Maintained | CI commitment for vLLM v0.9.1

Please refer to the Versioning policy for more details.

Weekly Meeting

License

Apache License 2.0, as found in the LICENSE file.

