Release v0.4.0

@Gregory-Pereira released this 26 Nov 20:19
· 2 commits to main since this release
04f3538

📦 llm-d v0.4.0 Release Notes

This release of the llm-d repo captures the release for the entire project: guides, components, and all.

Release Date: 2025-11-26


🧩 Component Summary

| Component | Version | Previous Version | Type |
|---|---|---|---|
| llm-d/llm-d-inference-scheduler | v0.4.0-rc.1 | v0.3.1 | Image |
| llm-d-incubation/llm-d-modelservice | v0.3.8 | v0.2.10 | Helm Chart |
| llm-d/llm-d-routing-sidecar | v0.4.0-rc.1 | v0.3.1 | Image |
| llm-d/llm-d-cuda | v0.4.0 | v0.3.1 | Image |
| llm-d/llm-d-aws | v0.4.0 | v0.3.1 | Image |
| llm-d/llm-d-xpu | v0.4.0 | v0.3.1 | Image |
| llm-d/llm-d-cpu | v0.4.0 | v0.3.1 | Image (New) |
| llm-d-incubation/llm-d-infra | v1.3.4 | v1.3.3 | Helm Chart |
| kubernetes-sigs/gateway-api-inference-extension | v1.2.0-rc.1 | v1.0.1 | Helm Chart |
| llm-d/llm-d-workload-variant-autoscaler | v0.0.8 | NA (new) | Helm Chart + Image |

🔹 llm-d/llm-d-inference-scheduler

  • Description: A scheduler that makes optimized routing decisions for inference requests to the llm-d inference framework.
  • Diff: v0.3.1 → v0.4.0-rc.1

🔹 llm-d-incubation/llm-d-modelservice

  • Description: modelservice is a Helm chart that simplifies LLM deployment on llm-d by declaratively managing Kubernetes resources for serving base models. It enables reproducible, scalable, and tunable model deployments through modular presets and clean integration with llm-d ecosystem components (including vLLM, Gateway API Inference Extension, LeaderWorkerSet).
  • Diff: v0.2.10 → v0.3.8

🔹 llm-d/llm-d-routing-sidecar

  • Description: A reverse proxy redirecting incoming requests to the prefill worker specified in the x-prefiller-host-port HTTP request header.
  • Diff: v0.3.1 → v0.4.0-rc.1
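
The header-based routing described above can be illustrated with a minimal Python sketch. This is not the sidecar's actual implementation: the function name `resolve_prefill_target` is hypothetical, the `host:port` value format is an assumption inferred from the header name, and returning `None` for a missing or malformed header is an illustrative choice only.

```python
from typing import Mapping, Optional, Tuple

# Header name as given in the release notes.
PREFILLER_HEADER = "x-prefiller-host-port"


def resolve_prefill_target(headers: Mapping[str, str]) -> Optional[Tuple[str, int]]:
    """Return (host, port) parsed from the prefiller header.

    Assumes the value looks like "host:port"; returns None when the
    header is absent or malformed (hypothetical behavior).
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(PREFILLER_HEADER, "").strip()
    if not raw:
        return None

    # Split on the last colon so the port is always the final segment.
    host, sep, port_text = raw.rpartition(":")
    if not sep or not host:
        return None

    try:
        port = int(port_text)
    except ValueError:
        return None

    if not 1 <= port <= 65535:
        return None
    return host, port
```

A proxy built along these lines would forward the request to the returned address when one is present, and would need its own policy for the `None` case.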

🔹 llm-d/llm-d

  • Description: A midstreamed image of vllm-project/vllm for inference, supporting features such as PD disaggregation, KV cache awareness, and more.
  • Diff: v0.3.1 → v0.4.0
  • Image Variants: This component is published in the following variants:
    • XPU: ghcr.io/llm-d/llm-d-xpu:v0.4.0
    • AWS: ghcr.io/llm-d/llm-d-aws:v0.4.0
    • CUDA: ghcr.io/llm-d/llm-d-cuda:v0.4.0
    • CPU: ghcr.io/llm-d/llm-d-cpu:v0.4.0
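
For deployment scripts that need to pick one of these variants, a small lookup such as the following may help. It is only a sketch: the helper name `image_for` and the lowercase variant keys are hypothetical, while the repository paths and the v0.4.0 tag are copied from the list above.

```python
# Image repositories as listed in these release notes.
IMAGE_VARIANTS = {
    "xpu": "ghcr.io/llm-d/llm-d-xpu",
    "aws": "ghcr.io/llm-d/llm-d-aws",
    "cuda": "ghcr.io/llm-d/llm-d-cuda",
    "cpu": "ghcr.io/llm-d/llm-d-cpu",
}

RELEASE_TAG = "v0.4.0"


def image_for(variant: str, tag: str = RELEASE_TAG) -> str:
    """Return the full image reference for a variant (hypothetical helper)."""
    key = variant.strip().lower()
    if key not in IMAGE_VARIANTS:
        known = ", ".join(sorted(IMAGE_VARIANTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}")
    return f"{IMAGE_VARIANTS[key]}:{tag}"
```

For example, `image_for("CUDA")` yields `ghcr.io/llm-d/llm-d-cuda:v0.4.0`.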

🔹 llm-d-incubation/llm-d-infra

  • Description: A Helm chart for deploying the gateway and gateway-related infrastructure assets for llm-d.
  • Diff: v1.3.3 → v1.3.4

🔹 kubernetes-sigs/gateway-api-inference-extension

  • Description: A Helm chart to deploy an InferencePool, a corresponding EndpointPicker (epp) deployment, and any other related assets.
  • Diff: v1.0.1 → v1.2.0-rc.1

🔹 llm-d/llm-d-workload-variant-autoscaler (New - Experimental)

  • Description: [TODO: Add description of the workload variant autoscaler]
  • History (new): v0.0.8
  • Note: This is an experimental component being included in this release for early testing and feedback.

For more information on any of the component projects or versions, please check out their repos directly. For information on installing and using the new release, refer to our guides. Thank you to all contributors who helped make this happen. Automated release notes are included below, but note that they only track work in the main repo and do not fully reflect a changelog across the project.

What's Changed

New Contributors

Full Changelog: v0.3.1...v0.4.0

Contributors

  • @russellb
  • @clubanderson
  • @petecheslock
  • @poussa
  • @smarterclayton
  • @diego-torres
  • @terrytangyuan
  • @herbertkb
  • @liu-cong
  • @aneeshkp
  • @dannawang0221
  • @Gregory-Pereira
  • @ZhengHongming888
  • @mamy-CS
  • @vMaroon
  • @robertgshaw2-redhat
  • @zetxqx
  • @yangligt2