OpenLLM
OpenLLM lets developers run any open-source LLMs as OpenAI-compatible API endpoints with a single command.
- 🔬 Built for fast iteration and production use
- 🚂 Supports llama3, qwen2, gemma, etc., and many quantized versions (see the full list)
- ⛓️ OpenAI-compatible API
- 💬 Built-in ChatGPT like UI
- 🔥 Accelerated LLM decoding with state-of-the-art inference backends
- 🌥️ Ready for enterprise-grade cloud deployment (Kubernetes, Docker and BentoCloud)
Installation and Setup
Install the OpenLLM package via PyPI:
```bash
pip install openllm
```
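The LangChain wrapper shown below is imported from `langchain_community`, which ships as a separate package, so install it as well:

```bash
pip install langchain-community
```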
LLM
OpenLLM supports a wide range of open-source LLMs as well as serving users' own fine-tuned LLMs. Use the `openllm model`
command to see all available models that are pre-optimized for OpenLLM.
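For example, a typical session might look like the following. Treat the exact subcommands and the model id as assumptions for recent OpenLLM releases, and check `openllm --help` for your version:

```bash
# List the models OpenLLM ships pre-optimized configurations for (assumed subcommand)
openllm model list

# Serve one of them as an OpenAI-compatible endpoint (assumed model id)
openllm serve llama3:8b
```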
Wrappers
There is an OpenLLM wrapper that supports interacting with a running OpenLLM server:
```python
from langchain_community.llms import OpenLLM
```
API Reference: OpenLLM
Wrapper for OpenLLM server
This wrapper supports interacting with OpenLLM's OpenAI-compatible endpoint.
To run a model, do:
```bash
openllm hello
```
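Because the running server exposes an OpenAI-compatible endpoint (as noted in the feature list above), any OpenAI client can talk to it directly, independent of LangChain. A minimal sketch, assuming a model is being served at `http://localhost:3000/v1`; the model id below is a placeholder, so substitute whatever model you actually started:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local OpenLLM server.
client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Hello, OpenLLM!"}],
)
print(response.choices[0].message.content)
```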
Wrapper usage:
```python
from langchain_community.llms import OpenLLM

llm = OpenLLM(base_url="http://localhost:3000/v1", api_key="na")
llm.invoke("What is the difference between a duck and a goose? And why are there so many geese in Canada?")
```
API Reference: OpenLLM
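The wrapper is a standard LangChain LLM, so it composes with the rest of the framework. A small sketch chaining it with a prompt template via the LCEL pipe syntax (the prompt text and input value are illustrative):

```python
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import OpenLLM

llm = OpenLLM(base_url="http://localhost:3000/v1", api_key="na")

# Compose a prompt template with the LLM into a runnable chain.
prompt = PromptTemplate.from_template(
    "What is a good name for a company that makes {product}?"
)
chain = prompt | llm

print(chain.invoke({"product": "mechanical keyboards"}))
```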
Usage
For a more detailed walkthrough of the OpenLLM wrapper, see the example notebook.