
vllm-ascend

| About Ascend | Documentation | #sig-ascend | Users Forum | Weekly Meeting |

English | 中文


Latest News 🔥

  • [2025/06] User stories page is now live! It kicks off with LLaMA-Factory / verl / TRL / GPUStack to demonstrate how vLLM Ascend helps Ascend users enhance their experience across fine-tuning, evaluation, reinforcement learning (RL), and deployment scenarios.
  • [2025/06] Contributors page is now live! All contributions deserve to be recorded; thanks to all contributors.
  • [2025/05] We released the first official version, v0.7.3! We collaborated with the vLLM community to publish a blog post sharing our practice: Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU.
  • [2025/03] We hosted the vLLM Beijing Meetup with the vLLM team! Please find the meetup slides here.
  • [2025/02] The vLLM community officially created the vllm-project/vllm-ascend repo for running vLLM seamlessly on the Ascend NPU.
  • [2024/12] We are working with the vLLM community to support [RFC]: Hardware pluggable.

Overview

vLLM Ascend (vllm-ascend) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.

It is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the Ascend NPU integration from vLLM.

By using the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts, embedding, and multi-modal LLMs, can run seamlessly on the Ascend NPU.
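The hardware-pluggable interface works through Python entry points: when a plugin package such as vllm-ascend is installed, vLLM can discover it at startup without any change to vLLM's own code. A minimal discovery sketch, assuming the entry-point group name `vllm.platform_plugins` from the plugin RFC (treat the group name as an assumption, not a confirmed constant):

```python
from importlib.metadata import entry_points


def discover_plugins(group: str) -> list:
    """Return sorted names of installed entry points in the given group.

    Handles both the Python >= 3.10 ``select()`` API and the older
    dict-style return value of ``entry_points()``.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        return sorted(ep.name for ep in eps.select(group=group))
    return sorted(ep.name for ep in eps.get(group, []))


# With vllm-ascend installed, this would be expected to list its platform
# plugin; in a bare environment the list is simply empty.
print(discover_plugins("vllm.platform_plugins"))
```

This is how "decoupled" integration looks in practice: the plugin registers itself at install time, and the host only has to scan a well-known group name.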

Prerequisites

  • Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series
  • OS: Linux
  • Software:
    • Python >= 3.9, < 3.12
    • CANN >= 8.1.RC1
    • PyTorch >= 2.5.1, torch-npu >= 2.5.1.post1.dev20250619
    • vLLM (the same version as vllm-ascend)
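The Python constraint above is a half-open range (3.12 itself is excluded). A small sketch of checking it before installing; the helper name is illustrative, not part of vllm-ascend:

```python
import sys


def python_version_ok(version_info=None, lower=(3, 9), upper=(3, 12)):
    """Check the documented constraint: Python >= 3.9 and < 3.12."""
    v = (version_info or sys.version_info)[:2]
    return lower <= v < upper


# Examples against the documented range:
assert python_version_ok((3, 9, 0))
assert python_version_ok((3, 11, 8))
assert not python_version_ok((3, 12, 0))
assert not python_version_ok((3, 8, 10))
```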

Getting Started

Please use the following recommended versions to get started quickly:

Version | Release type | Doc
v0.9.2rc1 | Latest release candidate | QuickStart and Installation for more details
v0.7.3.post1 | Latest stable version | QuickStart and Installation for more details

Contributing

See CONTRIBUTING for more details; it is a step-by-step guide to help you set up the development environment, build, and test.

We welcome and value any contributions and collaborations.

Branch

vllm-ascend has a main branch and dev branches.

  • main: the main branch, which corresponds to the vLLM main branch and is continuously monitored for quality through Ascend CI.
  • vX.Y.Z-dev: a development branch, created alongside a new vLLM release. For example, v0.7.3-dev is the dev branch for vLLM v0.7.3.
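The vX.Y.Z-dev naming scheme deterministically maps a dev branch to the vLLM release it tracks. A small sketch of that mapping (the helper is illustrative only, not part of the repo's tooling):

```python
import re

# Pattern for the vX.Y.Z-dev branch naming scheme described above.
DEV_BRANCH = re.compile(r"^v(\d+)\.(\d+)\.(\d+)-dev$")


def vllm_version_for(branch: str):
    """Return the vLLM version a dev branch tracks, or None for other branches."""
    m = DEV_BRANCH.match(branch)
    return "v{}.{}.{}".format(*m.groups()) if m else None


assert vllm_version_for("v0.7.3-dev") == "v0.7.3"
assert vllm_version_for("main") is None
```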

Below are the maintained branches:

Branch | Status | Note
main | Maintained | CI commitment for the vLLM main branch and the vLLM 0.9.x branch
v0.7.1-dev | Unmaintained | Only doc fixes are allowed
v0.7.3-dev | Maintained | CI commitment for vLLM v0.7.3; only bug fixes are allowed, and no new release tags will be created
v0.9.1-dev | Maintained | CI commitment for vLLM v0.9.1

Please refer to the Versioning policy for more details.

Weekly Meeting

License

Apache License 2.0, as found in the LICENSE file.

