ray-project/llmval-legacy (Public)

License

Apache-2.0 license

Folders and files

.gitignore
LICENSE.txt
NOTICE.txt
README.md
analyze-raw.ipynb
env_sample.txt
llmval.py
optional.txt
requirements.txt
sonnet.txt

llmval

LLMVal is a tool for validating and benchmarking LLMs.

Validation: we send a simple query to the LLM and ensure the returned data is valid. In particular, it checks for inter-request cross-over (request A gets the response for request B).
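
One way to detect cross-over is to tag every request with a random number and check which tag comes back. The sketch below illustrates that idea only; the prompt wording and the names make_tagged_prompt and classify_response are hypothetical and are not taken from llmval.py.

```python
import random
import re


def make_tagged_prompt(rng: random.Random) -> tuple[int, str]:
    """Build a prompt carrying a random six-digit tag the reply must echo."""
    tag = rng.randint(100_000, 999_999)
    # Hypothetical prompt wording, chosen only for this illustration.
    prompt = f"Repeat the number {tag} exactly, then stop."
    return tag, prompt


def classify_response(expected_tag: int, response_text: str, other_tags: set[int]) -> str:
    """Classify a reply as 'ok', 'crossover', or 'invalid'.

    'crossover' means the reply carries the tag of a different in-flight
    request; 'invalid' means it carries no known tag at all.
    """
    found = {int(n) for n in re.findall(r"\d+", response_text)}
    if expected_tag in found:
        return "ok"
    if found & set(other_tags):
        return "crossover"
    return "invalid"
```

A reply containing another request's tag is the "request A gets the response for request B" case; a reply with no recognizable tag is simply invalid.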

Benchmarking: LLMVal measures time to first token (TTFT), inter-token latency (ITL), and the number of requests that take longer than 3 seconds to start returning data.
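
As a rough illustration of these metrics, the sketch below derives TTFT and mean ITL from the time a request was sent and the arrival times of its streamed tokens. The function name latency_metrics and the choice of a mean for ITL are assumptions made for this example, not necessarily how llmval.py computes them.

```python
from statistics import mean

SLOW_START_SECONDS = 3.0  # threshold quoted in this README


def latency_metrics(request_sent: float, token_times: list[float]) -> dict:
    """Compute TTFT and mean ITL (in seconds) from token arrival timestamps."""
    if not token_times:
        raise ValueError("no tokens were received")
    # TTFT: delay between sending the request and the first token arriving.
    ttft = token_times[0] - request_sent
    # ITL: average gap between consecutive tokens (0.0 for a single token).
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    return {"ttft": ttft, "itl": itl, "slow_start": ttft > SLOW_START_SECONDS}
```

Timestamps can come from any monotonic clock, such as time.perf_counter(), as long as the same clock is used for the send time and the token arrivals.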

Variation in input and output token lengths is a deliberate design parameter, since the benchmark is intended to be representative. There are some optimizations (e.g. continuous batching) that we know work better with varying input and output lengths.

Supported endpoints

Currently supported endpoints include:

Any OpenAI-compatible endpoint, including Anyscale Endpoints, Anyscale Private Endpoints, OpenAI, Fireworks, Perplexity, etc.
Together
Vertex AI
SageMaker

Please see requirements.txt for more details on dependency requirements.

Upcoming refactor

This is prototype code. We are currently refactoring the code to be more extensible (including pluggable endpoints, varying traffic load, etc.).

In addition we plan to:

Make it possible to run the benchmark not only from the command line, but also to integrate it easily into CI/CD or job scheduling systems.
Control where the generated files and information go.
Automate report generation.

We expect this refactor to be complete some time in November 2023.

A note on rate limits

Many LLM providers have extremely low rate limits by default (e.g. Perplexity 3 requests per 90 seconds).

You can use the sleep parameter to overcome these difficulties, but it does affect the representativeness of the results.
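
To pick a sleep value, a simple estimate is to space rounds so that the average request rate stays at or below the provider's limit. The helper below is a sketch under the assumption that the sleep is applied between rounds and that each round sends a fixed number of requests; min_sleep_seconds is a hypothetical name, not part of llmval.py.

```python
def min_sleep_seconds(limit_requests: int, window_seconds: float, requests_per_round: int = 1) -> float:
    """Spacing between rounds that keeps the average rate within the limit.

    With a limit of `limit_requests` per `window_seconds`, each request
    "costs" window_seconds / limit_requests seconds of budget.
    """
    if limit_requests <= 0 or requests_per_round <= 0:
        raise ValueError("limit_requests and requests_per_round must be positive")
    return requests_per_round * window_seconds / limit_requests
```

For the Perplexity example above (3 requests per 90 seconds), sending one request per round gives a spacing of 30 seconds. Providers that enforce strict sliding windows may need a somewhat larger value.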

Other systems do not have rate limits, but we consider a system overloaded if the TTFT exceeds 3 seconds for more than 5% of queries.
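
The overload criterion can be expressed as a small check over the measured TTFT values. This is a minimal sketch; the function name is_overloaded and its parameter names are chosen for this example.

```python
def is_overloaded(ttfts: list[float], threshold_s: float = 3.0, max_slow_fraction: float = 0.05) -> bool:
    """Return True if more than `max_slow_fraction` of TTFTs exceed `threshold_s`."""
    if not ttfts:
        return False  # no measurements, so nothing to judge
    slow = sum(1 for t in ttfts if t > threshold_s)
    return slow / len(ttfts) > max_slow_fraction
```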

Default values

Default values are the ones that we use for testing Anyscale Endpoints. The distribution of inputs and outputs roughly mirrors the input and output patterns we see there.

We recommend setting the seed (or using the provided seed) to reduce run-to-run variability while still having randomization.
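
The effect of a fixed seed can be seen in a small sketch that draws a random number of input lines per request. The function sample_line_counts is hypothetical and only illustrates the principle: the lengths vary from request to request, yet two runs with the same seed see the same sequence.

```python
import random


def sample_line_counts(num_requests: int, min_lines: int, max_lines: int, seed: int) -> list[int]:
    """Draw one input length (in lines) per request from a seeded generator."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return [rng.randint(min_lines, max_lines) for _ in range(num_requests)]


# Two runs with the same seed produce identical, but still varied, lengths.
first = sample_line_counts(20, 8, 10, seed=117)
second = sample_line_counts(20, 8, 10, seed=117)
print(first == second)
```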

Run python llmval.py --help to see all options.

Usage

Provide the API base and key in a .env file. See env_sample.txt for an example.
Test an Anyscale Endpoint by sending 20 requests with the following command:
python llmval.py -r 20 -m "meta-llama/Llama-2-70b-chat-hf"
Control the number of input tokens by setting --min-lines and --max-lines, and the number of output tokens by setting --req-lines and --max-tokens:
python llmval.py -r 20 -f openai -m "gpt-3.5-turbo" --min-lines 8 --max-lines 10
python llmval.py -r 20 -f openai -m "gpt-3.5-turbo" --req-lines 3 --max-tokens 128
Control the sleep between rounds to avoid hitting rate limits:
python llmval.py -r 20 -f fireworks -m "accounts/fireworks/models/llama-v2-70b-chat" --sleep 10
Output will be saved at framework-timestamp.json and framework-timestamp_raw.json.
Use Jupyter with analyze-raw.ipynb to visualize and/or interact with the raw data.
