Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings
bentoml

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
@bentoml

BentoML

Build fast and reliable model serving systems

github banner

What's cooking? 👩‍🍳

🍱 BentoML: The Unified Serving Framework for AI/ML Systems

BentoML is a Python library for building online serving systems optimized for AI apps and model inference. It supports serving any model format/runtime and custom Python code, offering the key primitives for serving optimizations, task queues, batching, multi-model chains, distributed orchestration, and multi-GPU serving.

🎨 Examples: Learn by doing!

A collection of examples for BentoML, from deploying OpenAI-compatible LLM service, to building voice phone calling agents and RAG applications. Use these examples to learn how to use BentoML and build your own solutions.

🦾 OpenLLM: Self-hosting Large Language Models Made Easy

Run any open-source LLMs (Llama, Mistral, Qwen, Phi and more) or custom fine-tuned models as OpenAI-compatible APIs with a single command. It features a built-in chat UI, state-of-the-art inference performance, and a simplified workflow for production-grade cloud deployment.

☁️ BentoCloud: Unified Inference Platform for any model, on any cloud

BentoCloud is the easist way to build and deploy with BentoML, in our cloud or yours. It brings fast and scalable inference infrastructure into any cloud, allowing AI teams to move 10x faster in building AI applications with ML/AI models, while reducing compute cost - by maxmizing compute utilization, fast GPU autoscaling, minimimal coldstarts and full observability.Sign up today!.

Get in touch 💬

👉Join our Slack community!

👀 Follow us on X@bentomlai andLinkedIn

📖 Read ourblog

PinnedLoading

  1. BentoMLBentoMLPublic

    The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

    Python 8.3k 891

  2. OpenLLMOpenLLMPublic

    Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

    Python 12k 795

  3. YataiYataiPublic

    Model Deployment at Scale on Kubernetes 🦄️

    TypeScript 827 77

  4. comfy-packcomfy-packPublic

    A comprehensive toolkit for reliably locking, packing and deploying environments for ComfyUI workflows.

    Python 194 28

  5. llm-inference-handbookllm-inference-handbookPublic

    Everything you need to know about LLM inference

    TypeScript 247 22

  6. llm-optimizerllm-optimizerPublic

    Benchmark and optimize LLM inference across frameworks with ease

    Python 140 13

Repositories

Loading
Type
Select type
Language
Select language
Sort
Select order
Showing 10 of 116 repositories
  • BentoML Public

    The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

    bentoml/BentoML’s past year of commit activity
    Python 8,273Apache-2.0 891 134 3 UpdatedDec 2, 2025
  • bentoml/openai_emulator’s past year of commit activity
    Python0Apache-2.00 0 0 UpdatedDec 1, 2025
  • OpenLLM Public

    Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

    bentoml/OpenLLM’s past year of commit activity
    Python 11,966Apache-2.0 795 3 3 UpdatedDec 1, 2025
  • llm-inference-handbook Public

    Everything you need to know about LLM inference

    bentoml/llm-inference-handbook’s past year of commit activity
    TypeScript 247Apache-2.0 22 3 1 UpdatedNov 30, 2025
  • BentoVLLM Public

    Self-host LLMs with vLLM and BentoML

    bentoml/BentoVLLM’s past year of commit activity
    Python 160Apache-2.0 20 2 3 UpdatedNov 25, 2025
  • ai-gateway Public Forked fromenvoyproxy/ai-gateway

    Manages Unified Access to Generative AI Services built on Envoy Gateway

    bentoml/ai-gateway’s past year of commit activity
    Go0Apache-2.0 130 0 2 UpdatedNov 20, 2025
  • lago Public Forked fromgetlago/lago

    Open Source Metering and Usage Based Billing API ⭐️ Consumption tracking, Subscription management, Pricing iterations, Payment orchestration & Revenue analytics

    bentoml/lago’s past year of commit activity
    Go0AGPL-3.0 503 0 1 UpdatedNov 19, 2025
  • comfy-pack Public

    A comprehensive toolkit for reliably locking, packing and deploying environments for ComfyUI workflows.

    bentoml/comfy-pack’s past year of commit activity
    Python 194Apache-2.0 28 10 2 UpdatedNov 10, 2025
  • sglang Public Forked fromsgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    bentoml/sglang’s past year of commit activity
    Python0Apache-2.0 3,593 0 0 UpdatedNov 7, 2025
  • BentoOCR Public

    Turn any OCR models into online inference API endpoint 🚀 🌖

    bentoml/BentoOCR’s past year of commit activity
    Python 57 4 1 0 UpdatedOct 29, 2025

[8]ページ先頭

©2009-2025 Movatter.jp