Description
I would like a feature, both in the Python package and in the GUI, where one LLM can process multiple requests concurrently (with no queue) when there are enough hardware resources.
I heard llama.cpp supports this, but I could not find the feature in LM Studio.
With the current version we cannot make use of AsyncOpenAI: the requests just get queued! A sketch of what we are trying to do is shown below.
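For reference, here is a minimal sketch of what we try to do today with AsyncOpenAI against the LM Studio local server. It assumes the default http://localhost:1234/v1 endpoint and uses a placeholder model name, so adjust those as needed. The requests are dispatched concurrently with asyncio.gather, but they still appear to be processed one at a time:

```python
# Sketch: send several chat requests concurrently to the LM Studio server.
# Assumptions: server at http://localhost:1234/v1 (LM Studio default),
# "local-model" is a placeholder for whatever model is loaded, and the
# api_key value is ignored by the local server.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="local-model",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main() -> None:
    prompts = [
        "Summarize the plot of Hamlet in one sentence.",
        "Write a haiku about GPUs.",
        "Explain what a KV cache is.",
    ]
    # gather() fires all requests at once; today they still finish serially
    # because the server queues them instead of batching/parallelizing.
    answers = await asyncio.gather(*(ask(p) for p in prompts))
    for answer in answers:
        print(answer[:80])

asyncio.run(main())
```

The request here is that, given enough VRAM/compute, these concurrent calls be served in parallel rather than one after another.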