BentoML
Verified
We've verified that the organizationbentoml controls the domain:
- bentoml.com
Sponsor
🍱 BentoML: The Unified Serving Framework for AI/ML Systems
BentoML is a Python library for building online serving systems optimized for AI apps and model inference. It supports serving any model format/runtime and custom Python code, offering the key primitives for serving optimizations, task queues, batching, multi-model chains, distributed orchestration, and multi-GPU serving.
🎨 Examples: Learn by doing!
A collection of examples for BentoML, from deploying OpenAI-compatible LLM service, to building voice phone calling agents and RAG applications. Use these examples to learn how to use BentoML and build your own solutions.
🦾 OpenLLM: Self-hosting Large Language Models Made Easy
Run any open-source LLMs (Llama, Mistral, Qwen, Phi and more) or custom fine-tuned models as OpenAI-compatible APIs with a single command. It features a built-in chat UI, state-of-the-art inference performance, and a simplified workflow for production-grade cloud deployment.
☁️ BentoCloud: Unified Inference Platform for any model, on any cloud
BentoCloud is the easist way to build and deploy with BentoML, in our cloud or yours. It brings fast and scalable inference infrastructure into any cloud, allowing AI teams to move 10x faster in building AI applications with ML/AI models, while reducing compute cost - by maxmizing compute utilization, fast GPU autoscaling, minimimal coldstarts and full observability.Sign up today!.
👀 Follow us on X@bentomlai andLinkedIn
📖 Read ourblog
PinnedLoading
- comfy-pack
comfy-pack PublicA comprehensive toolkit for reliably locking, packing and deploying environments for ComfyUI workflows.
- llm-inference-handbook
llm-inference-handbook PublicEverything you need to know about LLM inference
- llm-optimizer
llm-optimizer PublicBenchmark and optimize LLM inference across frameworks with ease
Repositories
- openai_emulator Public
bentoml/openai_emulator’s past year of commit activity - ai-gateway Public Forked fromenvoyproxy/ai-gateway
Manages Unified Access to Generative AI Services built on Envoy Gateway
bentoml/ai-gateway’s past year of commit activity - lago Public Forked fromgetlago/lago
Open Source Metering and Usage Based Billing API ⭐️ Consumption tracking, Subscription management, Pricing iterations, Payment orchestration & Revenue analytics
bentoml/lago’s past year of commit activity - comfy-pack Public
A comprehensive toolkit for reliably locking, packing and deploying environments for ComfyUI workflows.
bentoml/comfy-pack’s past year of commit activity - sglang Public Forked fromsgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
bentoml/sglang’s past year of commit activity
